Engineered Cellular Pathways for Programmed Autoregulation of Differentiation

ABSTRACT

The present invention provides compositions and methods for programming mammalian cells to perform desired functions. In particular, the present invention provides compositions and methods for programming stem cells to differentiate into a desired cell type. A quorum sensing system that regulates the expression of cell fate regulators is introduced into mammalian host cells, such as stem cells. The quorum sensing system generally comprises vectors that express the components of a bacterial quorum sensing pathway, including proteins which catalyze the synthesis of an autoinducer and a gene encoding a regulatory partner of the autoinducer, and vectors in which genes encoding cell fate regulators are operably linked to a promoter induced by the autoinducer/regulatory partner complex. The system can also comprise vectors in which genes encoding additional cell fate regulators are operably linked to a promoter that is induced by a factor synthesized in response to a first stage of differentiation, so that a second stage of differentiation is triggered.

This application is a continuation of, and claims priority to, U.S. application Ser. No. 15/251,894, filed on Aug. 30, 2016, now abandoned, which is a continuation of U.S. application Ser. No. 14/174,475, filed on Feb. 6, 2014, and issued as U.S. Pat. No. 9,458,471 on Oct. 4, 2016, which is a continuation of U.S. application Ser. No. 12/312,197, filed on Apr. 29, 2009, and issued as U.S. Pat. No. 8,685,720 on Apr. 1, 2014, which is the U.S. National stage filing of PCT Application No. PCT/US2007/023227, filed on Nov. 1, 2007, now abandoned, which claims priority to U.S. provisional Patent Application Ser. No. 60/856,531 filed on Nov. 3, 2006, now abandoned, and to U.S. provisional Patent Application Ser. No. 60/905,483 filed on Mar. 7, 2007, now abandoned, each of which is herein incorporated by reference in its entirety for all purposes.

A Sequence Listing has been submitted in an ASCII text file named “16276” created on Oct. 25, 2019, consisting of 318,163 bytes, the entire content of which is herein incorporated by reference.

FIELD OF THE INVENTION

The present invention provides compositions and methods for programming mammalian cells to perform desired functions. In particular, the present invention provides compositions and methods for programming stem cells to differentiate into a desired cell type.

BACKGROUND OF THE INVENTION

Diabetes Mellitus is a heterogeneous mix of genetic abnormalities in the insulin-producing machinery ranging from the body's inability to produce enough insulin to the body's inability to recognize and/or use insulin. Type I diabetes is an autoimmune disease which systematically destroys insulin-producing β cells in the pancreas. Type II diabetes is caused by various genetic abnormalities in the pancreas and onset is directly correlated to obesity. The current standard treatment for diabetes is to maintain insulin levels by monitoring blood glucose and diet, to provide exogenous doses of insulin when necessary and to treat the consequences of diabetes such as loss of circulation to the extremities, glaucoma, and sepsis, as the disease progresses [Couper et al., Medical Journal of Australia, 179(8):441-447, 2003]. More radical treatments include full organ transplants, islet cell transplants or β cell transplants. Pancreatic transplantation candidates are put on a long waiting list for a suitable organ. Even when patients are lucky enough to be chosen for an allogeneic pancreatic organ transplant, they must take immunosuppressants in order to battle graft vs. host disease. A recent attempt to use islet cell transplant therapy provided short-lived relief in most patients but the transplanted β cells subsequently died or ceased to produce insulin in a majority of the initial successful transplants [Shapiro et al., New England Journal of Medicine, 355(13):1318-1330, 2006]. Clearly another approach is necessary to alleviate the problems caused by diabetes and address the root cause of the disease.

Recent developments in genome technologies, tissue engineering and synthetic biology offer possibilities to establish highly accurate and robust approaches for predictable and controllable cell fate regulation both temporally and spatially. Stem cell research promises to revolutionize the way many inherited and acquired diseases are treated and will also provide unprecedented insights into fetal development and the etiology of numerous disorders [Hochedlinger et al., N Engl J Med, 349(3):275-286, July 2003; Weissman, Science, 287(5457):1442-1446, February 2000; Lagasse et al., Immunity, 14(4):425-436, April 2001; Reya et al., Nature, 414(6859):105-111, November 2001]. Mouse embryonic stem (mES) cells are an attractive platform for this research because they are amenable to extensive genetic manipulation. When introduced into the appropriate in vitro or in vivo contexts, mES cells contribute to all tissue types of adult mice, including the germ line [Nagy et al., Development, 110(3):815-821, November 1990].

Consequently, there has been much excitement about the potential of these cells as an unlimited source of differentiated cell populations for transplantation or other therapies. Although potentially exciting and ground-breaking, ES cell-based therapies depend on the ability to reliably and controllably produce the necessary mature cell populations. In addition, directed differentiation must be absolute, given the tumorigenic potential of ES cells. With few exceptions, such directed production of desired cell populations has not been possible yet.

Current approaches towards tissue engineering and transplantation rely on carefully creating environments that induce cells to differentiate into desired tissues or organs. While these approaches have proven partially effective for certain applications, they are inherently limited since they rely on innate cellular response to existing host conditions or exogenous cues. Often, naturally occurring host conditions are insufficient to trigger the correct differentiation pathways. In those instances, researchers have attempted to provide appropriate environmental cues using scaffolds and exogenous signals. However, it is often difficult, if not impossible, to create and maintain the precise conditions that are required for tissue re-generation using such means.

What is needed in the art are systems and methods that can be used to cause stem cells to reliably differentiate into a desired cell type based on expression of genes introduced into the stem cells.

SUMMARY OF THE INVENTION

The present invention provides compositions and methods for programming mammalian cells to perform desired functions. In particular, the present invention provides compositions and methods for programming stem cells to differentiate into a desired cell type. Accordingly, in some embodiments, the present invention provides systems for directing differentiation of an undifferentiated cell type comprising: a) a first mammalian vector comprising a first gene encoding a protein that synthesizes an autoinducer; b) a second mammalian vector comprising a second gene encoding a regulatory protein that interacts with the autoinducer; c) a third mammalian vector comprising a promoter that binds to the regulatory protein in the presence of the autoinducer, the promoter operably linked to a third gene of interest encoding a first cell fate regulator. In some embodiments, the systems of the present invention further comprise a fourth mammalian vector comprising a promoter comprising a response element that binds to a regulatory protein produced in response to expression of the first cell fate regulator. The present invention is not limited to the use of any particular type of mammalian vector. Indeed, the use of a variety of mammalian vectors is contemplated, including, but not limited to lentiviral vectors, retroviral vectors, pseudotyped retroviral or lentiviral vectors, adenoviral vectors, AAV vectors, plasmids, artificial chromosomes, transposon vectors and the like.

In some embodiments, the first mammalian vector further comprises a promoter operably linked to the first gene. The present invention is not limited to the use of any particular promoter. Indeed, the use of a variety of promoters is contemplated. In some embodiments, the promoter is a repressible promoter. In some embodiments the promoter comprises a lac repressor. In some embodiments, the promoter is an inducible promoter. In some embodiments, the inducible promoter is a Tet promoter. The present invention is not limited to vectors encoding genes that synthesize any particular autoinducer. Indeed, the use of a variety of autoinducers is contemplated, including, but not limited to 3OC6HSL, C4HSL, and 3OC14HSL. The present invention is not limited to the use of genes encoding any particular autoinducer. Indeed, the use of several genes that encode proteins that catalyze the synthesis of autoinducers is contemplated, including, but not limited to, the LuxI, RhlI, and CinI proteins. The present invention is not limited to the use of any particular genes encoding LuxI, RhlI, or CinI. In some embodiments, the genes comprise codons that are optimized for expression in mammalian cells. In other embodiments, the genes utilized are at least 70%, 80%, 90% or 95% identical to the wild-type LuxI, RhlI or CinI genes.

In some preferred embodiments, the second mammalian vector comprises a gene encoding a regulatory protein that interacts with the autoinducer synthesized by the protein encoded by the gene of interest in the first vector. The present invention is not limited to the use of any particular regulatory protein. Indeed, the use of a variety of regulatory proteins is contemplated, including, but not limited to LuxR, RhlR, and CinR. The present invention is not limited to the use of any particular genes encoding LuxR, RhlR, or CinR. In some embodiments, the genes comprise codons that are optimized for expression in mammalian cells. In other embodiments, the genes utilized are at least 70%, 80%, 90% or 95% identical to the wild-type LuxR, RhlR or CinR genes.

The present invention is not limited to the use of any particular promoter in the third vector of the system of the present invention. In some embodiments, the third mammalian vector comprises a promoter selected from the group consisting of lux and LasR promoters. In some preferred embodiments, the lux promoter comprises multiple repeats of the lux binding sequence, which binds LuxR/autoinducer complex. Likewise, the vectors for use in the systems of the present invention may comprise promoters comprising multiple repeats of the binding sequences recognized by the RhlI and CinI autoinducer complexes. The present invention is not limited to the use of vectors encoding any particular cell fate regulator. In some embodiments, the first cell fate regulator is selected from the group consisting of Sox17, Gata4, and Gata6 and combinations thereof. In further embodiments, the vectors used in the systems of the present invention encode two or more of the cell fate regulators selected from the group consisting of Sox17, Gata4, Gata6, Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1.

In some embodiments of the systems of the present invention, the fourth mammalian vector further comprises a gene encoding a second cell fate regulator. The present invention is not limited to the use of any particular second cell fate regulator. In some embodiments, the second cell fate regulator is selected from the group consisting of Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1. In some embodiments, the systems of the present invention comprise multiple vectors encoding one or more of the foregoing cell fate regulators. The fourth vectors of the systems of the present invention are not limited to the use of any particular promoter. Indeed, a variety of stage specific promoters may be utilized. In some embodiments, the promoter comprising a response element that binds to a regulatory protein produced in response to expression of the first cell fate regulator comprises an α-fetoprotein promoter.

In some embodiments, the systems of the present invention comprise vectors encoding additional proteins involved in a quorum sensing pathway. Accordingly, in some embodiments, the systems of the present invention comprise a fifth mammalian vector comprising a gene encoding an acyl carrier protein (ACP). In further embodiments, the systems of the present invention comprise a sixth mammalian vector comprising a gene encoding acyl-acyl carrier protein synthase (AAS). The present invention is not limited to vectors comprising any particular ACP or AAS genes. In some embodiments, the genes comprise codons optimized for expression in mammalian cells. In further embodiments, the genes are at least 70%, 80%, 90% or 95% identical to the wild type ACP or AAS genes.

In some embodiments, the systems of the present invention further comprise a separate vector comprising a gene encoding RhlR. In some embodiments, the systems of the present invention further comprise a separate mammalian vector comprising a gene encoding Growth Arrest Factor. In some embodiments, the systems of the present invention further comprise a mammalian vector comprising a gene encoding TetR. In some embodiments, the systems of the present invention further comprise a separate mammalian vector comprising a gene of interest encoding a protein selected from the group consisting of TetR, LacI, and CinI operably linked to a Mouse Insulin promoter.

In some embodiments, the present invention provides a culture of mammalian cells comprising one or more of the foregoing vectors of the system of the present invention. In some embodiments, the present invention provides a mammalian cell comprising one or more of the foregoing vectors of the system of the present invention. The present invention is not limited to the use of any particular type of mammalian cells. In some embodiments, the cell is a totipotent cell. In some embodiments, the cell is a multipotent cell. In some embodiments, the cell is a differentiated cell. In some embodiments, the cell or cell culture comprises cells selected from the group consisting of embryonic stem cells, adult stem cells or cord blood stem cells. The present invention is not limited to mammalian cells of any particular species. In some embodiments, the cells are human, primate, mouse, cow, hamster, rat, pig, sheep or goat cells. In some embodiments, the differentiated cells are β-cells. In some embodiments, the cells produce insulin.

In some embodiments, the present invention provides methods of treating a patient comprising: providing a patient in need of treatment and the mammalian cells described in the foregoing paragraphs, and introducing the cells into the patient. In some embodiments, the patient is diabetic. In some embodiments, the mammalian cells produce insulin after introduction into the patient.

In some embodiments, the present invention provides methods of programming mammalian cells comprising: introducing a quorum sensing system into a mammalian cell, wherein the system causes production of an autoinducer molecule and a regulatory partner of the autoinducer; introducing an expression system for at least one cell fate regulator into the mammalian cell; culturing the cell under conditions such that the cell produces the autoinducer, wherein the autoinducer interacts with the regulatory partner to induce expression of the at least one cell fate regulator.
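The density dependence implied by this method can be illustrated with a minimal numerical sketch. It is not part of the described compositions or methods. It assumes a well-mixed culture in which autoinducer concentration scales linearly with cell density, and it assumes that the autoinducer/regulatory partner complex drives expression of the cell fate regulator according to a Hill function. The function names and every parameter value are hypothetical.

```python
def autoinducer_level(cell_density, production_per_cell=1.0, degradation_rate=1.0):
    """Steady-state autoinducer concentration in a well-mixed culture.

    Assumption: each cell produces autoinducer at a constant rate and the
    autoinducer is lost at a first-order rate, so the steady-state level is
    proportional to cell density. Units are arbitrary.
    """
    return production_per_cell * cell_density / degradation_rate


def regulator_expression(autoinducer, half_max=100.0, hill_coefficient=2.0):
    """Fractional activity (0 to 1) of the autoinducer-responsive promoter.

    Assumption: the response follows a Hill function; half_max is the
    autoinducer level giving half-maximal expression.
    """
    a_n = autoinducer ** hill_coefficient
    return a_n / (half_max ** hill_coefficient + a_n)


if __name__ == "__main__":
    for density in (10.0, 100.0, 1000.0):
        level = autoinducer_level(density)
        print(density, round(regulator_expression(level), 3))
```

Under these assumptions, expression of the cell fate regulator stays low in a sparse culture and approaches its maximum only once the population is dense enough for the autoinducer to accumulate.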

In some embodiments, the present invention provides methods of programming mammalian cells comprising 1) providing i) a mammalian cell, ii) a first vector encoding a protein that synthesizes an autoinducer molecule; iii) a second vector encoding a regulatory partner of the autoinducer; iv) a third vector encoding a gene of interest operably linked to a promoter activated by the interaction of the autoinducer and the regulatory partner; 2) introducing the vectors into the mammalian cell; 3) culturing the mammalian cell under conditions such that the regulatory partner and the autoinducer are synthesized and activate expression of the gene of interest.

In some embodiments, the present invention provides a mammalian cell comprising an exogenous gene selected from the group consisting of genes encoding LuxI, LuxR, RhlI, RhlR, CinI, and CinR. In some embodiments, the present invention provides a mammalian cell comprising an exogenous gene selected from the group consisting of Sox17, Gata4, and Gata6, wherein the exogenous gene is operably linked to a lux promoter. In some embodiments, the present invention provides a mammalian cell comprising an exogenous gene selected from the group consisting of Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1, wherein the exogenous gene is operably linked to an endoderm specific promoter. In some embodiments, the endoderm specific promoter is the α-fetoprotein promoter.

In some embodiments, the present invention provides a mammalian cell comprising a quorum sensing pathway. In some embodiments, the quorum sensing pathway comprises one or more exogenous genes encoding LuxI, LuxR, acyl carrier protein and acyl-acyl carrier protein synthase.

In some embodiments, the present invention provides a system for use in establishing tissue homeostasis in a differentiating cell system comprising mammalian vectors encoding at least one cell-cell communication system, wherein said at least a first cell-cell communication system controls proliferation or differentiation of cells in said differentiating cell system. In some embodiments, the systems further comprise at least a second cell-cell communication system, wherein said first and second cell-cell communication systems interact to establish homeostasis in said differentiating cell system. In some embodiments, the differentiating cell system comprises differentiable cells and differentiated cells produced from said differentiable cells. In some embodiments, the homeostasis is characterized by the controlled proliferation of said differentiable cells comprising said at least two cell-cell communication systems and the controlled production of said differentiated cells. In further embodiments, the at least two cell-cell communication systems interact to establish tissue homeostasis in a differentiating cell system comprising differentiable cells and differentiated cells by sensing the number of differentiating cells via the first cell-cell communication system and the number of differentiated cells via the second cell-cell communication system so that when the number of differentiable cells is low proliferation of the differentiable cells is induced, when the number of differentiable cells is high the proliferation of the differentiable cells is inhibited, when the number of differentiated cells is low the differentiation of the differentiable cells is induced and the proliferation of the differentiable cells is induced, and when the number of differentiated cells is high the differentiation of the differentiable cells is inhibited. The present invention is not limited to the use of any particular cell-cell communication systems.
Indeed, the use of a variety of cell-cell communication systems is contemplated. In some embodiments, the at least one cell-cell communication system is a bacterial cell-cell communication system. In other embodiments, the at least one cell-cell communication system is selected from the group consisting of the LuxI, RhlI, and CinI cell-cell communication systems. The present invention is not limited to the use of any particular type of vector. Indeed, the use of a variety of vectors is contemplated. In some embodiments, the mammalian vectors are retroviral vectors. In further embodiments, the systems of the present invention further comprise a cell differentiation control system.
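The homeostasis behavior described above can be restated as a small decision rule. The following sketch is illustrative only. The thresholds that separate "low" from "high" are hypothetical parameters, and the text does not say how conflicting signals combine (for example, many differentiable cells but few differentiated cells). The sketch assumes that proliferation is induced whenever either population is low.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    proliferate: bool    # proliferation of differentiable cells is induced
    differentiate: bool  # differentiation of differentiable cells is induced


def homeostasis_decision(n_differentiable, n_differentiated,
                         differentiable_threshold, differentiated_threshold):
    """Map the two sensed cell counts to proliferation/differentiation signals.

    Assumption: a count below its threshold is "low", otherwise "high".
    Assumption: when the two sensed signals disagree about proliferation,
    induction wins (logical OR).
    """
    differentiable_low = n_differentiable < differentiable_threshold
    differentiated_low = n_differentiated < differentiated_threshold
    return Decision(
        proliferate=differentiable_low or differentiated_low,
        differentiate=differentiated_low,
    )


if __name__ == "__main__":
    # Few differentiable cells, enough differentiated cells.
    print(homeostasis_decision(10, 500, 100, 100))
    # Both populations at or above their thresholds.
    print(homeostasis_decision(500, 500, 100, 100))
```

In this reading, the first cell-cell communication system supplies the differentiable-cell count and the second supplies the differentiated-cell count.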

In some embodiments, the present invention provides a differentiable cell comprising the foregoing systems, wherein said differentiable cell is cultured in vitro. In further embodiments, the present invention provides a differentiable mammalian cell comprising at least a first cell-cell communication pathway, wherein said first cell-cell communication pathway is encoded by exogenous genes. In some embodiments, the cells further comprise at least a second cell communication pathway. In some embodiments, the cell differentiates into a target differentiated cell. The present invention is not limited to any particular type of cell. Indeed, the use of a variety of cell types is contemplated. In some embodiments, the cell is a pluripotent, multipotent or totipotent cell. The present invention is not limited to the use of any particular cell-cell communication systems. Indeed, the use of a variety of cell-cell communication systems is contemplated. In some embodiments, the at least one cell-cell communication system is a bacterial cell-cell communication system. In other embodiments, the at least one cell-cell communication system is selected from the group consisting of the LuxI, RhlI, and CinI cell-cell communication systems. In some embodiments, the cell further comprises an exogenous cell differentiation pathway. In some embodiments, the first cell-cell communication pathway is a CinI/CinR cell-cell communication pathway. In some embodiments, the CinI/CinR cell-cell communication pathway comprises at least a first gene encoding CinI, a second gene encoding CinR, and a third gene of interest operably linked to a CinR/3OC14HSL inducible promoter. In some embodiments, the second cell-cell communication pathway is a RhlI/RhlR cell-cell communication pathway.
In some embodiments, the RhlI/RhlR cell-cell communication pathway comprises at least a first gene encoding RhlI, a second gene encoding RhlR, and a third gene of interest operably linked to a RhlR/C4HSL inducible promoter. In some embodiments, the cell comprises a cell differentiation pathway. In some embodiments, the cell differentiation pathway comprises at least one exogenous gene encoding a cell fate regulator. The present invention is not limited to the use of any particular cell fate regulator. Indeed, the use of a variety of cell fate regulators is contemplated, including, but not limited to, Sox17, Gata4, Gata6, Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1. In some embodiments, the third gene of interest operably linked to a RhlR/C4HSL inducible promoter is a protein that inhibits growth in a cell. In further embodiments, the protein that inhibits growth in a cell is Growth Arrest Factor. In some embodiments, the third gene of interest operably linked to a CinR/3OC14HSL inducible promoter is a repressor. In some embodiments, the repressor is lambda repressor. In some embodiments, the first and second cell-cell communication pathways interact to control proliferation and differentiation of said mammalian cell. In some embodiments, the cell differentiation pathway comprises at least one gene encoding a protein that causes said cell to differentiate into a target differentiated cell. In some embodiments, the target differentiated cell is a beta cell. In some embodiments, the second cell-cell communication pathway is activated in said target differentiated cell and comprises at least one gene encoding a protein that inhibits the proliferation of undifferentiated cells comprising said first and second cell-cell communication pathways.
In some embodiments, the cell differentiation pathway is activated in said target differentiated cell and comprises at least one gene encoding a protein that regulates expression of said at least one gene encoding a protein that causes undifferentiated cells comprising said first and second cell-cell communication pathways to differentiate into a target differentiated cell. In some embodiments, the gene encoding a protein that regulates expression of said at least one gene encoding a protein that causes undifferentiated cells comprising said first and second cell-cell communication pathways to differentiate into a target differentiated cell is a repressor. In some embodiments, the repressor is a lambda repressor. In some embodiments, the cell is maintained in vitro.

The present invention further provides methods of controlling proliferation and differentiation of a differentiable cell comprising: a) providing: i) a differentiable cell; and ii) at least one cell-cell communication pathway; and b) introducing said at least one cell-cell communication pathway into said differentiable cell so that when said at least one cell-cell communication pathway is expressed in said differentiable cell, the proliferation and differentiation of said differentiable cell is controlled. In some embodiments, two cell-cell communication pathways are introduced into said differentiable cells, wherein said two cell-cell communication pathways interact to control proliferation and differentiation of said differentiable cell and proliferation of target differentiated cells that differentiate from said differentiable cells. In some embodiments, at least one of said two cell-cell communication pathways provides regulatory feedback on differentiation of said differentiable cells. The present invention is not limited to the use of any particular cell-cell communication pathway. Indeed, the use of a variety of cell-cell communication pathways is contemplated. In some embodiments, the at least one cell-cell communication pathway is a bacterial cell-cell communication system. In other embodiments, the at least one cell-cell communication pathway is selected from the group consisting of the LuxI, RhlI, and CinI cell-cell communication systems.

In some embodiments, the present invention provides methods of treating a subject comprising: a) providing a plurality of mammalian cells as described above; and b) introducing said cells into a subject under conditions such that the proliferation and differentiation of said cells is controlled to provide a source of differentiated target cells in said subject. In some embodiments, the subject is human.

In some embodiments, the present invention provides a symmetry breaking system for mammalian cells comprising a first vector comprising an activator operably linked to a promoter responsive to the activator and a repressor and a second vector encoding a repressor operably linked to a promoter responsive to said activator. In some embodiments, the present invention provides a population of cells comprising the symmetry breaking system, wherein expression of the activator causes activation of the activator and the repressor and expression of the repressor causes repression of the activator, so that at any given time only a portion of the cells within the population have high levels of expression of the activator as compared to the repressor.

In some embodiments, the present invention provides a cascade system for mammalian cells comprising a first vector encoding a first repressor operably linked to a promoter responsive to a first activator, a second vector encoding a second repressor operably linked to a promoter repressed by said first repressor, a third vector encoding a second activator operably linked to a promoter repressed by said second repressor, and a fourth vector encoding a repressor operably linked to a promoter that is activated by said second activator and repressed by a third repressor. In some embodiments, the present invention provides a population of cells comprising the cascade system.

In some embodiments, the present invention provides a toggle switch system for mammalian cells comprising a first vector encoding a first repressor operably linked to a promoter repressed by a second repressor and a second vector encoding said second repressor operably linked to a promoter repressed by said first repressor. In some embodiments, the first repressor is TetR and said second repressor is LacI. In some embodiments, the present invention provides cells comprising the toggle switch system.
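The mutual repression that defines the toggle switch can be illustrated with a generic two-repressor rate model. This is a minimal sketch and not a model of the vectors themselves. It assumes a dimensionless form in which each repressor is synthesized at a rate that falls with the level of the other repressor and is lost at a unit rate. The function name, the parameters alpha and n, and the forward Euler integration scheme are all illustrative choices. Here u may be read as the first repressor (for example, TetR) and v as the second (for example, LacI).

```python
def simulate_toggle(u0, v0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate a two-repressor mutual-repression model with forward Euler.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v

    alpha (maximal synthesis rate) and n (cooperativity) are hypothetical,
    dimensionless values chosen so that the model is bistable.
    Returns the final (u, v) levels.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** n) - u
        dv = alpha / (1.0 + u ** n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


if __name__ == "__main__":
    # A small initial bias toward one repressor decides the final state.
    print(simulate_toggle(2.0, 1.0))
    print(simulate_toggle(1.0, 2.0))
```

With these assumed parameters, whichever repressor starts slightly ahead ends up high while the other is held low, which is the two-state memory behavior expected of a toggle switch.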

In some embodiments, the present invention provides a population of uncommitted and committed cells comprising: a) a cell population control module comprising a first cell-cell communication pathway; b) a cell commitment module comprising a symmetry breaking system and a second cell-cell communication system; and c) a cell differentiation module; wherein said cell population control module senses the concentration of uncommitted cells in the population via said first cell-cell communication system, the cell commitment module senses the concentration of committed cells in the population via said second cell-cell communication system and controls which cells within the population are allowed to commit via said symmetry breaking system, so that when the concentration of uncommitted cells in the population is high, the concentration of committed cells in the population is low, and there are cells that are allowed to commit, said cell differentiation module is activated. In some embodiments, the cell communication module further comprises a cascade system and a toggle switch system that interact with said symmetry breaking system and said second cell-cell communication system to control the commitment of cells within the population to differentiate. In some embodiments, the cells further comprise an apoptosis module. In some embodiments, the apoptosis module comprises a vector encoding a repressor operably linked to a tissue-specific promoter and a vector encoding an apoptosis gene operably linked to a promoter regulated by said repressor, so that said apoptosis gene is not expressed in a tissue wherein said tissue-specific promoter is active.

In some embodiments, the present invention provides a fusion protein comprising a secretion signal, cell penetrating polypeptide, and trans-acting domain in operable combination, wherein at least one of said secretion signal, cell penetrating polypeptide and trans-acting domain are from different proteins. The present invention is not limited to the use of any particular secretion signal. Indeed, the use of a variety of secretion signals is contemplated, including, but not limited to IgG, tPA, serum albumin, lactoferrin, and growth hormone secretion signals. The present invention is not limited to the use of any particular cell penetrating polypeptide. Indeed, the use of a variety of cell penetrating polypeptides is contemplated, including, but not limited to, TAT cell penetrating polypeptide and penetratin. The present invention is not limited to the use of any particular trans-acting domain. Indeed, the use of a variety of trans-acting domains is contemplated, including, but not limited to, zinc finger binding domains. In further embodiments, the present invention provides nucleic acids encoding the fusion proteins. In further embodiments, the present invention provides a vector comprising the nucleic acid encoding the fusion protein. In still further embodiments, the present invention provides a cell comprising the nucleic acid.

In some embodiments, the present invention provides a cell-cell communication system comprising: a) a first vector comprising a nucleic acid encoding a fusion protein comprising a secretion signal, cell penetrating polypeptide, and trans-acting domain in operable combination and b) a second vector comprising a nucleic acid encoding a promoter comprising an element that binds said trans-acting domain operably linked to a protein of interest. In further embodiments, the present invention provides a population of cells comprising the system.

DESCRIPTION OF THE FIGURES

FIGS. 1A-1B: Autoregulated quorum sensing based two-step differentiation from mES to endoderm to β cells. (FIG. 1A) Circuit diagram of the system. Details are in the text. Gata4/Sox17 activates the α-fetoprotein promoter (pAFP) indirectly by stimulating expression of AFP (dashed line). Likewise, Pdx1 and Ngn3 activate the Mouse Insulin Promoter (MIP) indirectly by stimulating insulin production (dashed lines). AFP production results in green fluorescence and insulin production results in red fluorescence. (FIG. 1B) Progress of system from a single cell to a collection of β cells. Due to quorum sensing, a population of mES cells is necessary in order to differentiate into endoderm.

FIG. 2A-2C: (FIG. 2A) 3OC₆HSL synthesis by mammalian 293FT cells. Supernatant from the cells' growth media was collected and filter sterilized. Histograms show the fluorescence intensities of bacterial cells sensitive to 3OC₆HSL grown in the filtered supernatant, expressing green fluorescent protein due to 3OC₆HSL from the mammalian cells. (FIG. 2B) 3OC₆HSL detection by mammalian receiver cells. 293FT cells express DsRed constitutively, but express EGFP only upon induction with exogenous 3OC₆HSL. (FIG. 2C) Controlled stem cell differentiation with genetically inducible systems. Three different lentiviral genetic constructs were built, enabling exogenous chemical induction into myoblasts, adipocytes, or maintenance of stem cell properties. Dox-induced expression of MyoD in mES cells causes visible changes to a myoblast morphology. In addition, the marker protein for myoblasts, dystrophin, was expressed from these cells (data not shown). PPARγ expression resulted in an adipocyte morphology. In a separate experiment, induced expression of Nanog supplanted the need for a growth factor usually required to maintain the overall character of mES cells.

FIG. 3: Diagram of a four module system for cell proliferation and differentiation control.

FIG. 4A-4B: (FIG. 4A) shRNA is transduced into engineered mES cells and cells are allowed to differentiate into pancreatic β cells. (FIG. 4B) Before differentiation, shRNAs will be uniformly present in mES cells. After differentiation there will be different pools of cell types (e.g., undifferentiated mES cells, endodermal cells or pancreatic β cells) depending on the effect of the transduced shRNAs. The resulting shRNA ratios compared with the shRNA ratios before differentiation will allow identification of genes necessary for differentiation.

FIG. 5A-5D provides the sequence (SEQ ID NO:01) for the LuxI vector-pLV-Hef1a-LuxIm-IRES2-DsRed2.

FIG. 6 provides a diagram of the vector of FIG. 5A-5D.

FIG. 7A-7E provides a description and sequence (SEQ ID NO:02) for the LuxR vector-pLV-Hef1a-p65H4LuxRFm-IRES2-DsRed2.

FIG. 8 provides a diagram of the vector of FIG. 7A-7E.

FIG. 9A-9E provides a description and sequence (SEQ ID NO:03) for the lux promoter vector for expression of Gata4/Sox17-pLV-minCMVLuxO7-IRES2-EGFP.

FIG. 10 provides a diagram of the vector of FIG. 9A-9E.

FIG. 11A-11E provides a description and sequence (SEQ ID NO:04) for the ACP vector-pLV-Hef1a-ACPm-IRES2-DsRed2.

FIG. 12 provides a diagram of the vector of FIG. 11A-11E.

FIG. 13A-13F provides a description and sequence (SEQ ID NO:05) for the AAS vector-pLV-Hef1a-AAS-IRES2-EGFP.

FIG. 14 provides a diagram of the vector of FIG. 13A-13F.

FIG. 15A-15E provides a description and sequence (SEQ ID NO:06) for the AFP promoter vector PDX1-pLV-AFP-Pdx1-IRES2-DsRed2.

FIG. 16 provides a diagram of the vector of FIG. 15A-15E.

FIG. 17A-17E provides a description and sequence (SEQ ID NO:07) for the AFP promoter vector Ngn3-pLV-AFP-Ngn3-IRES2-DsRed2.

FIG. 18 provides a diagram of the vector of FIG. 17A-17E.

FIG. 19A-19E provides a description and sequence (SEQ ID NO:08) for the AFP promoter vector for TetRKRAB-pLV-AFP-TetRKRAB-IRES2-DsRed2.

FIG. 20 provides a diagram of the vector of FIG. 19A-19E.

FIG. 21A-21E provides a description and sequence (SEQ ID NO:09) for the RhlI vector-pLV-Hef1a-RhlI-IRES2-DsRed2.

FIG. 22 provides a diagram of the vector of FIG. 21A-21E.

FIG. 23A-23H provides a diagram and sequence (SEQ ID NO:10) for the vector pLV-MIP-LacIKRAB-IRES2-EGFP.

FIG. 24A-24G provides a diagram and sequence (SEQ ID NO: 11) for the vector pLV-MIP-IRES2-EGFP.

FIG. 25A-25G provides a diagram and sequence (SEQ ID NO: 12) for the vector pLV-TetRKRAB-IRES2-Puro.

FIG. 26A-26B provides a diagram of a synthetic signaling circuit utilizing cell penetrating polypeptide elements.

FIG. 27A-27B provides a diagram of a synthetic signaling circuit utilizing cell penetrating polypeptide elements.

FIG. 28A-28G provides a plasmid and sequence for P_A1_R1/A1 (SEQ ID NO: 13).

FIG. 29A-29G provides a plasmid and sequence for p_tetO/LacI (SEQ ID NO: 14).

FIG. 30A-30F provides a plasmid and sequence for p_RhlI_R3/A2 (SEQ ID NO: 15).

FIG. 31A-31F provides a plasmid and sequence for P_Cin/CinR (SEQ ID NO: 16).

FIG. 32A-32F provides a plasmid and sequence for P_Cin/CI (SEQ ID NO: 17).

FIG. 33A-33F provides a plasmid and sequence for P_Cin_tetO/CinI (SEQ ID NO: 18).

FIG. 34A-34G provides a plasmid and sequence for P_lacO_R4/TetR (SEQ ID NO: 19).

FIG. 35A-35F provides a plasmid and sequence for P_A1/R1 (SEQ ID NO: 20).

FIG. 36A-36F provides a plasmid and sequence for P_A1/R2 (SEQ ID NO: 21).

FIG. 37A-37G provides a plasmid and sequence for P_R2/R3 (SEQ ID NO: 22).

FIG. 38A-38F provides a plasmid and sequence for P_A2/CIOR/R4 (SEQ ID NO: 23).

DEFINITIONS

As used herein, the term “quorum sensing” refers to the detection of the density of a cell type by means of cell-cell communications. In some embodiments, “quorum sensing” is associated with an action upon achievement of a certain cell density.

As used herein, the term “quorum sensing system” refers to a system of genes that provides components for quorum sensing.

As used herein, the term “totipotent” means the ability of a cell to differentiate into any type of cell in a differentiated organism.

As used herein, the term “pluripotent” refers to a cell line capable of differentiating into several differentiated cell types.

As used herein, the term “multipotent” refers to a cell line capable of differentiating into at least two differentiated cell types.

As used herein, the term “host cell” refers to any eukaryotic cell (e.g., mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo.

As used herein, the term “stem cells” means cells that are totipotent or pluripotent and are capable of differentiating into one or more different cell types.

As used herein, the term “embryonic stem cells” means stem cells derived from an embryo.

As used herein, the term “adult stem cells” means stem cells derived from an organism after birth.

As used herein, the term “mesodermal cell line” means a cell line displaying characteristics associated with mesodermal cells.

As used herein, the term “endodermal cell line” means a cell line displaying characteristics normally associated with endodermal cells.

As used herein, the term “neural cell line” means a cell line displaying characteristics normally associated with neural cell lines. Examples of such characteristics include, but are not limited to, expression of GFAP, neuron-specific enolase, Neu-N, neurofilament-N, or tau.

As used herein the term “differentiable cell” refers to a cell that can differentiate into another cell type and includes multipotent, pluripotent and totipotent cells.

As used herein, the term “target differentiated cell” refers to a predetermined cell type that differentiates from a differentiable cell.

As used herein, the term “differentiating cell system” refers to a population of differentiable cells and target differentiated cells.

As used herein, the term “tissue homeostasis” refers to a steady state achieved between a population of differentiable cells and target differentiated cells in a differentiating cell system wherein cell-cell communication between the differentiable cells and the target differentiated cells controls the number of differentiable cells and target differentiated cells in the system.

As used herein, the term “proliferation” as used with respect to cells in a differentiating cell system refers to the production of cells of like type from a given cell population, such as the production of additional differentiable cells from a population of differentiable cells.

As used herein, the term “differentiation” as used with respect to cells in a differentiating cell system refers to the process by which cells differentiate from one cell type (e.g., a multipotent, totipotent or pluripotent differentiable cell) to another cell type such as a target differentiated cell (e.g., a beta cell).

The term “cell-cell communication pathway” refers to a network of two or more genes encoding proteins (e.g., LuxR, CinR, RhlR) and proteins that synthesize products (e.g., 3OC₆HSL, C₁₄HSL, C₄HSL) that are involved in cell-cell communication. Exemplary cell-cell communication pathways include, but are not limited to, the LuxI/LuxR, CinI/CinR, and RhlI/RhlR pathways. As a further example, the LuxI/LuxR cell-cell communication pathway can include the LuxI and LuxR genes. Additionally, the LuxI/LuxR cell-cell communication pathway can include a gene of interest operably linked to a promoter induced by LuxR/3OC₆HSL. The CinI/CinR cell-cell communication pathway can include the CinI and CinR genes. Additionally, the CinI/CinR cell-cell communication pathway can include a gene of interest operably linked to a promoter induced by CinR/C₁₄HSL. The RhlI/RhlR cell-cell communication pathway can include the RhlI and RhlR genes. Additionally, the RhlI/RhlR cell-cell communication pathway can include a gene of interest operably linked to a promoter induced by RhlR/C₄HSL.

As used herein, the term “cell culture” refers to any in vitro culture of cells. Included within this term are continuous cell lines (e.g., with an immortal phenotype), primary cell cultures, finite cell lines (e.g., non-transformed cells), and any other cell population maintained in vitro, including oocytes and embryos.

As used herein, the term “vector” refers to any genetic element, such as a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc., which is capable of replication when associated with the proper control elements and which can transfer gene sequences between cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.

As used herein, the term “multiplicity of infection” or “MOI” refers to the ratio of integrating vectors:host cells used during transfection or transduction of host cells. For example, if 100,000 vectors are used to transduce 100,000 host cells, the multiplicity of infection is 1. The use of this term is not limited to events involving transduction, but instead encompasses introduction of a vector into a host by methods such as lipofection, microinjection, calcium phosphate precipitation, and electroporation.
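The ratio in this definition can be illustrated with a short calculation. This is a minimal sketch; the function and parameter names are hypothetical and are chosen only for this example.

```python
def multiplicity_of_infection(vector_count, host_cell_count):
    """Return the ratio of integrating vectors to host cells (the MOI).

    Both arguments are counts; the names are illustrative assumptions.
    """
    if host_cell_count <= 0:
        raise ValueError("host_cell_count must be positive")
    return vector_count / host_cell_count


# The example from the definition: 100,000 vectors and 100,000 host cells.
moi = multiplicity_of_infection(100_000, 100_000)
```

With the numbers given in the definition, the computed ratio is 1.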

As used herein, the term “genome” refers to the genetic material (e.g., chromosomes) of an organism.

The term “nucleotide sequence of interest” refers to any nucleotide sequence (e.g., RNA or DNA), the manipulation of which may be deemed desirable for any reason (e.g., treat disease, confer improved qualities, expression of a protein of interest in a host cell, expression of a ribozyme, etc.), by one of ordinary skill in the art. Such nucleotide sequences include, but are not limited to, coding sequences of structural genes (e.g., reporter genes, selection marker genes, oncogenes, drug resistance genes, growth factors, etc.), and non-coding regulatory sequences which do not encode an mRNA or protein product (e.g., promoter sequence, polyadenylation sequence, termination sequence, enhancer sequence, etc.).

As used herein, the term “protein of interest” refers to a protein encoded by a nucleic acid of interest.

As used herein, the term “exogenous gene” refers to a gene that is not naturally present in a host organism or cell, or is artificially introduced into a host organism or cell.

The term “gene” refers to a nucleic acid (e.g., DNA or RNA) sequence that comprises coding sequences necessary for the production of a polypeptide or precursor (e.g., proinsulin). The polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb or more on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.

As used herein, the term “gene expression” refers to the process of converting genetic information encoded in a gene into RNA (e.g., mRNA, rRNA, tRNA, or snRNA) through “transcription” of the gene (i.e., via the enzymatic action of an RNA polymerase), and for protein encoding genes, into protein through “translation” of mRNA. Gene expression can be regulated at many stages in the process. “Up-regulation” or “activation” refers to regulation that increases the production of gene expression products (i.e., RNA or protein), while “down-regulation” or “repression” refers to regulation that decreases production. Molecules (e.g., transcription factors) that are involved in up-regulation or down-regulation are often called “activators” and “repressors,” respectively.

Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” “DNA encoding,” “RNA sequence encoding,” and “RNA encoding” refer to the order or sequence of deoxyribonucleotides or ribonucleotides along a strand of deoxyribonucleic acid or ribonucleic acid. The order of these deoxyribonucleotides or ribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA or RNA sequence thus codes for the amino acid sequence.

As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′,” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids.
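The distinction between partial and complete complementarity can be illustrated with a short sketch that scores two antiparallel strands written in the aligned form used above (one strand 5′ to 3′, the other 3′ to 5′). The function name and the choice to report a simple matched fraction are assumptions made for this example; only DNA bases are handled.

```python
# Watson-Crick base-pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_5to3, strand_3to5):
    """Return the fraction of aligned positions that obey the pairing rules.

    The strands are compared position by position as written, so the second
    strand must already be given in the 3'-to-5' orientation.
    """
    if not strand_5to3 or len(strand_5to3) != len(strand_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    top = strand_5to3.upper()
    bottom = strand_3to5.upper()
    matches = sum(PAIRS.get(x) == y for x, y in zip(top, bottom))
    return matches / len(top)


complete = complementarity("AGT", "TCA")  # the example pair from the text
partial = complementarity("AGT", "TCG")   # last position is mismatched
```

A score of 1 corresponds to “complete” complementarity; any lower non-zero score corresponds to “partial” complementarity in the sense defined above.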

The terms “homology” and “percent identity” when used in relation to nucleic acids refer to a degree of complementarity. There may be partial homology (i.e., partial identity) or complete homology (i.e., complete identity). A partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid sequence and is referred to using the functional term “substantially homologous.” The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe (i.e., an oligonucleotide which is capable of hybridizing to another oligonucleotide of interest) will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target sequence under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target which lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.

The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions that promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).

When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.

When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe that can hybridize (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.

As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the T_(m) of the formed hybrid, and the G:C ratio within the nucleic acids. A single molecule that contains pairing of complementary nucleic acids within its structure is said to be “self-hybridized.”

As used herein, the term “T_(m)” is used in reference to the “melting temperature” of a nucleic acid. The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the T_(m) of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the T_(m) value may be calculated by the equation: T_(m)=81.5+0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other references include more sophisticated computations that take structural as well as sequence characteristics into account for the calculation of T_(m).
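The simple estimate given above can be worked through in a few lines. This is a minimal sketch of that one equation only, applied to a plain DNA string; the function name is hypothetical, and none of the more sophisticated corrections mentioned in the text (e.g., structural effects) are included.

```python
def estimate_tm(sequence):
    """Estimate T_m from the formula T_m = 81.5 + 0.41 * (% G+C).

    Assumes the conditions stated in the text (aqueous solution, 1 M NaCl).
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(base in "GC" for base in seq)
    percent_gc = 100.0 * gc_count / len(seq)
    return 81.5 + 0.41 * percent_gc


tm_half_gc = estimate_tm("ATGCATGC")  # 50% G+C
tm_no_gc = estimate_tm("ATATATAT")    # 0% G+C
```

For a sequence with 50% G+C content the formula evaluates to 81.5 + 0.41 × 50 = 102.0, and for a sequence with no G or C it reduces to the constant term, 81.5.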

As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences. Thus, conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.

“High stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄.H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5×Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

“Medium stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄.H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5×Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0×SSPE, 1.0% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

“Low stringency conditions” comprise conditions equivalent to binding or hybridization at 42° C. in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄.H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5×Denhardt's reagent [50×Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C. when a probe of about 500 nucleotides in length is employed.

The terms “in operable combination,” “in operable order,” and “operably linked” as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.

As used herein, the term “selectable marker” refers to a gene that encodes an enzymatic activity that confers the ability to grow in medium lacking what would otherwise be an essential nutrient (e.g., the HIS3 gene in yeast cells); in addition, a selectable marker may confer resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed. Selectable markers may be “dominant”; a dominant selectable marker encodes an enzymatic activity that can be detected in any eukaryotic cell line. Examples of dominant selectable markers include the bacterial aminoglycoside 3′ phosphotransferase gene (also referred to as the neo gene) that confers resistance to the drug G418 in mammalian cells, the bacterial hygromycin G phosphotransferase (hyg) gene that confers resistance to the antibiotic hygromycin and the bacterial xanthine-guanine phosphoribosyl transferase gene (also referred to as the gpt gene) that confers the ability to grow in the presence of mycophenolic acid. Other selectable markers are not dominant in that their use must be in conjunction with a cell line that lacks the relevant enzyme activity. Examples of non-dominant selectable markers include the thymidine kinase (tk) gene that is used in conjunction with tk⁻ cell lines, the CAD gene, which is used in conjunction with CAD-deficient cells, and the mammalian hypoxanthine-guanine phosphoribosyl transferase (hprt) gene, which is used in conjunction with hprt⁻ cell lines. A review of the use of selectable markers in mammalian cell lines is provided in Sambrook, J. et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.9-16.15.

As used herein, the term “regulatory element” refers to a genetic element that controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, RNA export elements, internal ribosome entry sites, etc. (defined infra).

Transcriptional control signals in eukaryotes comprise “promoter” and “enhancer” elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 [1987]). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect and mammalian cells, and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (for review see, Voss et al., Trends Biochem. Sci., 11:287 [1986]; and Maniatis et al., supra). For example, the SV40 early gene enhancer is very active in a wide variety of cell types from many mammalian species and has been widely used for the expression of proteins in mammalian cells (Dijkema et al., EMBO J. 4:761 [1985]). Two other examples of promoter/enhancer elements active in a broad range of mammalian cell types are those from the human elongation factor 1α gene (Uetsuki et al., J. Biol. Chem., 264:5791 [1989]; Kim et al., Gene 91:217 [1990]; and Mizushima and Nagata, Nuc. Acids. Res., 18:5322 [1990]) and the long terminal repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777 [1982]) and the human cytomegalovirus (Boshart et al., Cell 41:521 [1985]).

As used herein, the term “promoter/enhancer” denotes a segment of DNA which contains sequences capable of providing both promoter and enhancer functions (i.e., the functions provided by a promoter element and an enhancer element, see above for a discussion of these functions). For example, the long terminal repeats of retroviruses contain both promoter and enhancer functions. The enhancer/promoter may be “endogenous” or “exogenous” or “heterologous.” An “endogenous” enhancer/promoter is one that is naturally linked with a given gene in the genome. An “exogenous” or “heterologous” enhancer/promoter is one that is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques such as cloning and recombination) such that transcription of that gene is directed by the linked enhancer/promoter.

Regulatory elements may be tissue specific or cell specific. The term “tissue specific” as it applies to a regulatory element refers to a regulatory element that is capable of directing selective expression of a nucleotide sequence of interest to a specific type of tissue (e.g., liver) in the relative absence of expression of the same nucleotide sequence of interest in a different type of tissue (e.g., lung).

Tissue specificity of a regulatory element may be evaluated by, for example, operably linking a reporter gene to a promoter sequence (which is not tissue-specific) and to the regulatory element to generate a reporter construct, introducing the reporter construct into the genome of an animal such that the reporter construct is integrated into every tissue of the resulting transgenic animal, and detecting the expression of the reporter gene (e.g., detecting mRNA, protein, or the activity of a protein encoded by the reporter gene) in different tissues of the transgenic animal. The detection of a greater level of expression of the reporter gene in one or more tissues relative to the level of expression of the reporter gene in other tissues shows that the regulatory element is “specific” for the tissues in which greater levels of expression are detected. Thus, the term “tissue-specific” (e.g., liver-specific) as used herein is a relative term that does not require absolute specificity of expression. In other words, the term “tissue-specific” does not require that one tissue have extremely high levels of expression and another tissue have no expression. It is sufficient that expression is greater in one tissue than another. By contrast, “strict” or “absolute” tissue-specific expression is meant to indicate expression in a single tissue type (e.g., liver) with no detectable expression in other tissues.

The term “cell type specific” as applied to a regulatory element refers to a regulatory element that is capable of directing selective expression of a nucleotide sequence of interest in a specific type of cell in the relative absence of expression of the same nucleotide sequence of interest in a different type of cell within the same tissue. The term “cell type specific” when applied to a regulatory element also means a regulatory element capable of promoting selective expression of a nucleotide sequence of interest in a region within a single tissue.

Cell type specificity of a regulatory element may be assessed using methods well known in the art (e.g., immunohistochemical staining and/or Northern blot analysis). Briefly, for immunohistochemical staining, tissue sections are embedded in paraffin, and paraffin sections are reacted with a primary antibody specific for the polypeptide product encoded by the nucleotide sequence of interest whose expression is regulated by the regulatory element. A labeled (e.g., peroxidase conjugated) secondary antibody specific for the primary antibody is allowed to bind to the sectioned tissue and specific binding detected (e.g., with avidin/biotin) by microscopy. Briefly, for Northern blot analysis, RNA is isolated from cells and electrophoresed on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support (e.g., nitrocellulose or a nylon membrane). The immobilized RNA is then probed with a labeled oligo-deoxyribonucleotide probe or DNA probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists.

The term “promoter,” “promoter element,” or “promoter sequence” as used herein, refers to a DNA sequence which when ligated to a nucleotide sequence of interest is capable of controlling the transcription of the nucleotide sequence of interest into mRNA. A promoter is typically, though not necessarily, located 5′ (i.e., upstream) of a nucleotide sequence of interest whose transcription into mRNA it controls, and provides a site for specific binding by RNA polymerase and other transcription factors for initiation of transcription.

Promoters may be constitutive or regulatable. The term “constitutive” when made in reference to a promoter means that the promoter is capable of directing transcription of an operably linked nucleic acid sequence in the absence of a stimulus (e.g., heat shock, chemicals, etc.). In contrast, a “regulatable” promoter is one that is capable of directing a level of transcription of an operably linked nucleic acid sequence in the presence of a stimulus (e.g., heat shock, chemicals, etc.), which is different from the level of transcription of the operably linked nucleic acid sequence in the absence of the stimulus.

As used herein, the term “nucleic acid binding protein” refers to proteins that bind to nucleic acid, and in particular to proteins that cause increased (i.e., activators or transcription factors) or decreased (i.e., inhibitors, repressors) transcription from a gene.

The presence of “splicing signals” on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York [1989], pp. 16.7-16.8). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

Efficient expression of recombinant DNA sequences in eukaryotic cells requires expression of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term “poly A site” or “poly A sequence” as used herein denotes a DNA sequence that directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be “heterologous” or “endogenous.” An endogenous poly A signal is one that is found naturally at the 3′ end of the coding region of a given gene in the genome. A heterologous poly A signal is one that is isolated from one gene and placed 3′ of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook, supra, at 16.6-16.7).

Eukaryotic expression vectors may also contain “viral replicons” or “viral origins of replication.” Viral replicons are viral DNA sequences that allow for the extrachromosomal replication of a vector in a host cell expressing the appropriate replication factors. Vectors that contain either the SV40 or polyoma virus origin of replication replicate to high “copy number” (up to 10⁴ copies/cell) in cells that express the appropriate viral T antigen. Vectors that contain the replicons from bovine papillomavirus or Epstein-Barr virus replicate extrachromosomally at “low copy number” (˜100 copies/cell). However, it is not intended that expression vectors be limited to any particular viral origin of replication.

As used herein, the term “long terminal repeat” or “LTR” refers to transcriptional control elements located in or isolated from the U3 region 5′ and 3′ of a retroviral genome. As is known in the art, long terminal repeats may be used as control elements in retroviral vectors, or isolated from the retroviral genome and used to control expression from other types of vectors.

As used herein, the terms “RNA export element” or “Pre-mRNA Processing Enhancer (PPE)” refer to 3′ and 5′ cis-acting post-transcriptional regulatory elements that enhance export of RNA from the nucleus. “PPE” elements include, but are not limited to Mertz sequences (described in U.S. Pat. Nos. 5,914,267 and 5,686,120, all of which are incorporated herein by reference) and woodchuck mRNA processing enhancer (WPRE; WO99/14310 and U.S. Pat. No. 6,136,597, each of which is incorporated herein by reference).

As used herein, the term “polycistronic” refers to an mRNA encoding more than one polypeptide chain (See, e.g., WO 93/03143, WO 88/05486, and European Pat. No. 117058, all of which are incorporated herein by reference). Likewise, the term “arranged in polycistronic sequence” refers to the arrangement of genes encoding two different polypeptide chains in a single mRNA.

As used herein, the term “internal ribosome entry site” or “IRES” refers to a sequence located between polycistronic genes that permits the production of the expression product originating from the second gene by internal initiation of the translation of the dicistronic mRNA. Examples of internal ribosome entry sites include, but are not limited to, those derived from foot and mouth disease virus (FDV), encephalomyocarditis virus, poliovirus and RDV (Scheper et al., Biochem. 76: 801-809 [1994]; Meyer et al., J. Virol. 69: 2819-2824 [1995]; Jang et al., J. Virol. 62: 2636-2643 [1988]; Haller et al., J. Virol. 66: 5075-5086 [1995]). Vectors incorporating IRES's may be assembled as is known in the art. For example, a retroviral vector containing a polycistronic sequence may contain the following elements in operable association: nucleotide polylinker, gene of interest, an internal ribosome entry site and a mammalian selectable marker or another gene of interest. The polycistronic cassette is situated within the retroviral vector between the 5′ LTR and the 3′ LTR at a position such that transcription from the 5′ LTR promoter transcribes the polycistronic message cassette. The transcription of the polycistronic message cassette may also be driven by an internal promoter (e.g., cytomegalovirus promoter) or an inducible promoter, which may be preferable depending on the use. The polycistronic message cassette can further comprise a cDNA or genomic DNA (gDNA) sequence operatively associated within the polylinker. Any mammalian selectable marker can be utilized as the polycistronic message cassette mammalian selectable marker. Such mammalian selectable markers are well known to those of skill in the art and can include, but are not limited to, kanamycin/G418, hygromycin B or mycophenolic acid resistance markers.

As used herein, the term “retrovirus” refers to a retroviral particle which is capable of entering a cell (i.e., the particle contains a membrane-associated protein such as an envelope protein or a viral G glycoprotein which can bind to the host cell surface and facilitate entry of the viral particle into the cytoplasm of the host cell) and integrating the retroviral genome (as a double-stranded provirus) into the genome of the host cell. The term “retrovirus” encompasses Oncovirinae (e.g., Moloney murine leukemia virus (MoMOLV), Moloney murine sarcoma virus (MoMSV), and Mouse mammary tumor virus (MMTV)), Spumavirinae, and Lentivirinae (e.g., Human immunodeficiency virus, Simian immunodeficiency virus, Equine infectious anemia virus, and Caprine arthritis-encephalitis virus; See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).

As used herein, the term “retroviral vector” refers to a retrovirus that has been modified to express a gene of interest. Retroviral vectors can be used to transfer genes efficiently into host cells by exploiting the viral infectious process. Foreign or heterologous genes cloned (i.e., inserted using molecular biological techniques) into the retroviral genome can be delivered efficiently to host cells that are susceptible to infection by the retrovirus. Through well known genetic manipulations, the replicative capacity of the retroviral genome can be destroyed. The resulting replication-defective vectors can be used to introduce new genetic material to a cell but they are unable to replicate. A helper virus or packaging cell line can be used to permit vector particle assembly and egress from the cell. Such retroviral vectors comprise a replication-deficient retroviral genome containing a nucleic acid sequence encoding at least one gene of interest (i.e., a polycistronic nucleic acid sequence can encode more than one gene of interest), a 5′ retroviral long terminal repeat (5′ LTR); and a 3′ retroviral long terminal repeat (3′ LTR).

The term “pseudotyped retroviral vector” refers to a retroviral vector containing a heterologous membrane protein. The term “membrane-associated protein” refers to a protein (e.g., a viral envelope glycoprotein or the G proteins of viruses in the Rhabdoviridae family such as VSV, Piry, Chandipura and Mokola) that is associated with the membrane surrounding a viral particle; these membrane-associated proteins mediate the entry of the viral particle into the host cell. The membrane-associated protein may bind to specific cell surface protein receptors, as is the case for retroviral envelope proteins, or the membrane-associated protein may interact with a phospholipid component of the plasma membrane of the host cell, as is the case for the G proteins derived from members of the Rhabdoviridae family.

The term “heterologous membrane-associated protein” refers to a membrane-associated protein which is derived from a virus that is not a member of the same viral class or family as that from which the nucleocapsid protein of the vector particle is derived. “Viral class or family” refers to the taxonomic rank of class or family, as assigned by the International Committee on Taxonomy of Viruses.

The term “Rhabdoviridae” refers to a family of enveloped RNA viruses that infect animals, including humans, and plants. The Rhabdoviridae family encompasses the genus Vesiculovirus that includes vesicular stomatitis virus (VSV), Cocal virus, Piry virus, Chandipura virus, and Spring viremia of carp virus (sequences encoding the Spring viremia of carp virus are available under GenBank accession number U18101). The G proteins of viruses in the Vesiculovirus genus are virally-encoded integral membrane proteins that form externally projecting homotrimeric spike glycoprotein complexes that are required for receptor binding and membrane fusion. The G proteins of viruses in the Vesiculovirus genus have a covalently bound palmitic acid (C₁₆) moiety. The amino acid sequences of the G proteins from the Vesiculoviruses are fairly well conserved. For example, the Piry virus G protein shares about 38% identity and about 55% similarity with the VSV G proteins (several strains of VSV are known, e.g., Indiana, New Jersey, Orsay, San Juan, etc., and their G proteins are highly homologous). The Chandipura virus G protein and the VSV G proteins share about 37% identity and 52% similarity. Given the high degree of conservation (amino acid sequence) and the related functional characteristics (e.g., binding of the virus to the host cell and fusion of membranes, including syncytia formation) of the G proteins of the Vesiculoviruses, the G proteins from non-VSV Vesiculoviruses may be used in place of the VSV G protein for the pseudotyping of viral particles. The G proteins of the Lyssa viruses (another genus within the Rhabdoviridae family) also share a fair degree of conservation with the VSV G proteins and function in a similar manner (e.g., mediate fusion of membranes) and therefore may be used in place of the VSV G protein for the pseudotyping of viral particles.
The Lyssa viruses include the Mokola virus and the Rabies viruses (several strains of Rabies virus are known and their G proteins have been cloned and sequenced). The Mokola virus G protein shares stretches of homology (particularly over the extracellular and transmembrane domains) with the VSV G proteins, showing about 31% identity and 48% similarity. Preferred G proteins share at least 25% identity, preferably at least 30% identity and most preferably at least 35% identity with the VSV G proteins. The VSV G protein from the New Jersey strain (the sequence of this G protein is provided in GenBank accession numbers M27165 and M21557) is employed as the reference VSV G protein.
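The tiered identity thresholds stated above (at least 25%, 30%, and 35% identity to the reference VSV G protein) can be illustrated with a minimal sketch. It assumes the candidate and reference sequences have already been aligned, with gaps written as `-`, and it counts identity over ungapped columns only. The source does not specify the alignment method or the denominator used for percent identity, so both are assumptions, and the function names `percent_identity` and `identity_tier` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned to equal length, gaps written as '-'.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def identity_tier(identity_pct: float) -> str:
    """Map a percent identity onto the 25/30/35% thresholds given in the text."""
    if identity_pct >= 35.0:
        return "at least 35% identity"
    if identity_pct >= 30.0:
        return "at least 30% identity"
    if identity_pct >= 25.0:
        return "at least 25% identity"
    return "below 25% identity"


if __name__ == "__main__":
    # Toy aligned fragments, purely illustrative (not real G protein sequences).
    print(percent_identity("ACDE-F", "ACDFGF"))
    print(identity_tier(31.0))
```

Under this sketch, the approximately 31% identity reported above for the Mokola virus G protein would fall in the "at least 30% identity" tier.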

As used herein, the term “lentivirus vector” refers to retroviral vectors derived from the Lentiviridae family (e.g., human immunodeficiency virus, simian immunodeficiency virus, equine infectious anemia virus, and caprine arthritis-encephalitis virus) that are capable of integrating into non-dividing cells (See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are incorporated herein by reference).

The term “pseudotyped lentivirus vector” refers to a lentivirus vector containing a heterologous membrane protein (e.g., a viral envelope glycoprotein or the G proteins of viruses in the Rhabdoviridae family such as VSV, Piry, Chandipura and Mokola).

As used herein, the term “transposon” refers to transposable elements (e.g., Tn5, Tn7, and Tn10) that can move or transpose from one position to another in a genome. In general, the transposition is controlled by a transposase. The term “transposon vector,” as used herein, refers to a vector encoding a nucleic acid of interest flanked by the terminal ends of a transposon. Examples of transposon vectors include, but are not limited to, those described in U.S. Pat. Nos. 6,027,722; 5,958,775; 5,968,785; 5,965,443; and 5,719,055, all of which are incorporated herein by reference.

As used herein, the term “adeno-associated virus (AAV) vector” refers to a vector derived from an adeno-associated virus serotype, including without limitation, AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAVX7, etc. AAV vectors can have one or more of the AAV wild-type genes deleted in whole or part, preferably the rep and/or cap genes, but retain functional flanking ITR sequences.

AAV vectors can be constructed using recombinant techniques that are known in the art to include one or more heterologous nucleotide sequences flanked on both ends (5′ and 3′) with functional AAV ITRs. In the practice of the invention, an AAV vector can include at least one AAV ITR and a suitable promoter sequence positioned upstream of the heterologous nucleotide sequence and at least one AAV ITR positioned downstream of the heterologous sequence. A “recombinant AAV vector plasmid” refers to one type of recombinant AAV vector wherein the vector comprises a plasmid. As with AAV vectors in general, 5′ and 3′ ITRs flank the selected heterologous nucleotide sequence.

AAV vectors can also include transcription sequences such as polyadenylation sites, as well as selectable markers or reporter genes, enhancer sequences, and other control elements that allow for the induction of transcription. Such control elements are described above.

As used herein, the term “AAV virion” refers to a complete virus particle. An AAV virion may be a wild type AAV virus particle (comprising a linear, single-stranded AAV nucleic acid genome associated with an AAV capsid, i.e., a protein coat), or a recombinant AAV virus particle (described below). In this regard, single-stranded AAV nucleic acid molecules (either the sense/coding strand or the antisense/anticoding strand as those terms are generally defined) can be packaged into an AAV virion; both the sense and the antisense strands are equally infectious.

As used herein, the term “recombinant AAV virion” or “rAAV” is defined as an infectious, replication-defective virus composed of an AAV protein shell encapsidating (i.e., surrounding with a protein coat) a heterologous nucleotide sequence, which in turn is flanked 5′ and 3′ by AAV ITRs. A number of techniques for constructing recombinant AAV virions are known in the art (See, e.g., U.S. Pat. No. 5,173,414; WO 9201070; WO 93/03769; Lebkowski et al., Molec. Cell. Biol. 8:3988-3996 [1988]; Vincent et al., Vaccines 90 [1990](Cold Spring Harbor Laboratory Press); Carter, Current Opinion in Biotechnology 3:533-539 [1992]; Muzyczka, Current Topics in Microbiol. and Immunol. 158:97-129 [1992]; Kotin, Human Gene Therapy 5:793-801 [1994]; Shelling and Smith, Gene Therapy 1:165-169 [1994]; and Zhou et al., J. Exp. Med. 179:1867-1875 [1994], all of which are incorporated herein by reference).

Suitable nucleotide sequences for use in AAV vectors (and, indeed, any of the vectors described herein) include any functionally relevant nucleotide sequence. Thus, the AAV vectors of the present invention can comprise any desired gene that encodes a protein that is defective or missing from a target cell genome or that encodes a non-native protein having a desired biological or therapeutic effect (e.g., an antiviral function), or the sequence can correspond to a molecule having an antisense or ribozyme function. Suitable genes include those used for the treatment of inflammatory diseases, autoimmune, chronic and infectious diseases, including such disorders as AIDS, cancer, neurological diseases, cardiovascular disease, hypercholesterolemia; various blood disorders including various anemias, thalassemias and hemophilia; genetic defects such as cystic fibrosis, Gaucher's Disease, adenosine deaminase (ADA) deficiency, emphysema, etc. A number of antisense oligonucleotides (e.g., short oligonucleotides complementary to sequences around the translational initiation site (AUG codon) of an mRNA) that are useful in antisense therapy for cancer and for viral diseases have been described in the art. (See, e.g., Han et al., Proc. Natl. Acad. Sci. USA 88:4313-4317 [1991]; Uhlmann et al., Chem. Rev. 90:543-584 [1990]; Helene et al., Biochim. Biophys. Acta. 1049:99-125 [1990]; Agarwal et al., Proc. Natl. Acad. Sci. USA 85:7079-7083 [1989]; and Heikkila et al., Nature 328:445-449 [1987]). For a discussion of suitable ribozymes, see, e.g., Cech et al. (1992) J. Biol. Chem. 267:17479-17482 and U.S. Pat. No. 5,225,347, incorporated herein by reference.

By “adeno-associated virus inverted terminal repeats” or “AAV ITRs” is meant the art-recognized palindromic regions found at each end of the AAV genome which function together in cis as origins of DNA replication and as packaging signals for the virus. For use with the present invention, flanking AAV ITRs are positioned 5′ and 3′ of one or more selected heterologous nucleotide sequences and, together with the rep coding region or the Rep expression product, provide for the integration of the selected sequences into the genome of a target cell.

The nucleotide sequences of AAV ITR regions are known (See, e.g., Kotin, Human Gene Therapy 5:793-801 [1994]; Berns, K. I. “Parvoviridae and their Replication” in Fundamental Virology, 2nd Edition, (B. N. Fields and D. M. Knipe, eds.) for the AAV-2 sequence). As used herein, an “AAV ITR” need not have the wild-type nucleotide sequence depicted, but may be altered, e.g., by the insertion, deletion or substitution of nucleotides. Additionally, the AAV ITR may be derived from any of several AAV serotypes, including without limitation, AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, AAVX7, etc. The 5′ and 3′ ITRs which flank a selected heterologous nucleotide sequence need not necessarily be identical or derived from the same AAV serotype or isolate, so long as they function as intended, i.e., to allow for the integration of the associated heterologous sequence into the target cell genome when the rep gene is present (either on the same or on a different vector), or when the Rep expression product is present in the target cell.

As used herein, the term “in vitro” refers to an artificial environment and to processes or reactions that occur within an artificial environment. In vitro environments can consist of, but are not limited to, test tubes and cell cultures. The term “in vivo” refers to the natural environment (e.g., an animal or a cell) and to processes or reactions that occur within a natural environment.

As used herein, the term “passage” refers to the process of diluting a culture of cells that has grown to a particular density or confluency (e.g., 70% or 80% confluent), and then allowing the diluted cells to regrow to the particular density or confluency desired (e.g., by replating the cells or establishing a new roller bottle culture with the cells).

As used herein, the term “stable,” when used in reference to a genome, refers to the stable maintenance of the information content of the genome from one generation to the next, or, in the particular case of a cell line, from one passage to the next. Accordingly, a genome is considered to be stable if no gross changes occur in the genome (e.g., a gene is deleted or a chromosomal translocation occurs). The term “stable” does not exclude subtle changes that may occur to the genome such as point mutations.

As used herein, the term “purified” refers to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated. An “isolated nucleic acid sequence” is therefore a purified nucleic acid sequence. “Substantially purified” molecules are at least 60% free, preferably at least 75% free, and more preferably at least 90% free from other components with which they are naturally associated.

GENERAL DESCRIPTION OF THE INVENTION

The present invention uses the novel paradigm of utilizing exogenous cell-cell communication pathways to autoregulate cell proliferation and differentiation. In some embodiments, the present invention provides a variety of cell-cell communication pathways and regulatory motifs that can be incorporated into any desired cell type.

In some preferred embodiments, the systems, compositions and methods of the present invention allow for tissue homeostasis to be achieved between a source of differentiable cells such as stem cells and target differentiated cells produced from the differentiable cells so that both cell populations are maintained. In some embodiments, the systems, compositions and methods control the proliferation and differentiation of the differentiable cell population so that a population of differentiable cells is maintained over time that can differentiate into the desired target differentiated cells. In preferred embodiments, the differentiable cells are subject to a regulatory feedback mechanism that controls a) the proliferation or inhibition of proliferation of differentiable cells into additional differentiable cells so that a source of differentiable cells is stably maintained and b) the differentiation of differentiable cells into target differentiated cells so that a stable population of target differentiated cells is maintained.

In some embodiments, the present invention utilizes a series of modules to control proliferation and differentiation of differentiable cells. In some embodiments, the modules each comprise one or more genes. In further embodiments, the genes are provided on vectors that can be introduced into a desired cell line. In some embodiments, the first module comprises one or more nucleic acid constructs that interact to provide population control for a population of proliferating cells. In some embodiments, the proliferating cells are differentiable cells. In some embodiments, the second module comprises one or more nucleic acid constructs that interact to control commitment of the differentiable cells to differentiate. In some preferred embodiments, if the differentiable cell population is above a certain threshold and the target differentiated cells are below a certain threshold, some of the differentiable cells commit to differentiate into the target differentiated cells. In some embodiments, the third module comprises one or more nucleic acid constructs that interact to cause differentiation of differentiable cells into target differentiated cells. In some embodiments, the fourth module comprises one or more nucleic acid constructs that trigger apoptosis if the differentiable cell migrates out of a desired location.

Precise in vivo control of stem cell differentiation into desired cell types, such as insulin-producing β cells, the cell type that is adversely affected in Diabetes Mellitus, is achievable through use of the compositions and methods described herein. In some preferred embodiments, the approaches described herein ensure a constant and steady supply of precursor cells and β cells which autoregulate insulin production. This approach also has the potential to bypass graft vs. host disease by using naive embryonic stem cells [Burt et al., Journal of Experimental Medicine, 199(7):895-904, 2004] or the patient's own adult stem cells. Finally, the approach is modular, controllable and flexible, allowing us to genetically engineer pathways that best address a patient's disease state.

This approach represents a paradigm shift in tissue engineering and diabetes treatment. Artificial cell-cell communication coordinates cell population behavior and the formation of insulin-producing β cells by precisely controlling gene expression in a two-step differentiation process. In the systems of the present invention, cells are not simply induced exogenously to differentiate, but rather are programmed to sense and respond to changes in their environment and coordinate their collective behavior based on the needs of the organism. It is important to initially build a large, undifferentiated reservoir of cells. Quorum sensing allows for controlled growth of mES cells until they reach the required density before they are directed to differentiate. Once mES cells have terminally differentiated into beta cells, they either stop dividing or divide at a very slow rate. Importantly, once these beta cells lose function or die due, for example, to an attack by the immune system, the ES cell reservoir detects this condition and produces new beta cells.

DETAILED DESCRIPTION OF THE INVENTION

The present invention utilizes artificial cell-cell communication pathways to control proliferation and differentiation of a population of differentiable cells. In some embodiments, multiple exogenous cell-cell communication pathways are introduced into a population of differentiable cells to provide a regulatory feedback control system. A schematic representation of an exemplary system is provided in FIG. 3. FIG. 3 depicts a modular system for control of proliferation and differentiation of cells. In preferred embodiments, the cells of the present invention comprise one or more of the depicted modules or components of the depicted modules. The depicted modules comprise stem cell population control modules, commitment to differentiation modules, cell differentiation modules, and apoptosis modules. In some preferred embodiments, the cells of the present invention comprise at least one population control module and commitment module that interface with a differentiation module and an apoptosis module. In some preferred embodiments, the cell population control module and commitment module interface to control the number of uncommitted cells in the population (i.e., cells that retain the ability to differentiate) and the number of committed cells in the population (i.e., cells that have committed to differentiate, but have not necessarily differentiated into the desired target cell type).

In preferred embodiments, the modules of the present invention comprise separate motifs or circuit designs that interact to produce a desired outcome. In particularly preferred embodiments, cell-cell communication, oscillator, cascade, and/or toggle switch motifs are incorporated into one or more of the modules of the present invention. In preferred embodiments, these motifs interact to provide a symmetry breaking condition so that a population of stem cells is maintained and not exhausted. These individual components and their interactions with one another are described in detail below.

In some embodiments, the present invention provides artificial cell-cell communication systems for use in mammalian cells. In some embodiments, cell-cell communication systems are derived from bacterial cell-cell communication systems from organisms such as Pseudomonas aeruginosa, Rhizobium leguminosarum, and Vibrio fischeri. These systems comprise genes whose products catalyze the synthesis of chemical signals of the acyl-homoserine lactone (AHL) family. In other embodiments, the cell-cell communication systems utilize secretion signals and viral internalization signals.

A. Population Control Module

As shown in FIG. 3, in some embodiments, the present invention provides a population control module comprising a pathway of exogenous genes that are preferably introduced into a population of stem cells. The population control module interacts with the commitment module to control the proliferation and differentiation of the cells within the system. The population control module is utilized to detect the number of stem cells in the system. If the stem cell population is low, then stem cell proliferation is allowed. If the stem cell population is high, then proliferation is inhibited. The commitment module (explained in more detail below) is used to detect the number of differentiated cells in the system (in this example, beta cells). If the beta cell population is high, proliferation of stem cells is inhibited. If the beta cell population is low, differentiation of stem cells is induced along with stem cell proliferation. A third module, described in more detail below, interacts with the two cell-cell communication modules to cause differentiation of stem cells into the desired target differentiated cell, in this example beta cells.

Referring to FIG. 3, in some embodiments, the population control module preferably comprises genes encoding a cell-cell communication pathway, preferably a RhlI/RhlR cell-cell communication pathway. The invention is not limited to the use of any particular cell-cell communication pathway in the first module. In other embodiments, LuxI/LuxR, CinI/CinR, or viral-based cell-cell communication systems are utilized. In preferred embodiments, bacterial cell-cell signaling pathways are utilized. As shown in FIG. 3, the population control module senses the density of the stem cell population. In some embodiments, the population control module comprises an exogenous RhlI gene operably linked to a repressor element, preferably a LacI repressor element. In the absence of the repressor, the product of the exogenous RhlI gene catalyzes the synthesis of C₄HSL. The population control module further comprises an exogenous RhlR gene and an exogenous gene encoding a Growth Arrest Factor (GAF) operably linked to a promoter responsive to the C₄HSL/RhlR complex. Under conditions of a high density of stem cells, the corresponding high amount of C₄HSL interacts with the gene product of RhlR to form a complex, which activates expression of GAF, thus inhibiting proliferation of the stem cells.
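The density-dependent growth arrest described above can be illustrated with a minimal numerical sketch. It is not the patent's model: it assumes that the C₄HSL concentration is proportional to stem cell density, that GAF induction follows a Hill function of that concentration, and that GAF scales the proliferation rate down toward zero. All parameter values and the names `hill` and `simulate_density` are hypothetical, and the LacI control of the RhlI gene and any cell death are omitted.

```python
def hill(x: float, k_half: float, n: float) -> float:
    """Hill activation function: 0 at x = 0, approaching 1 as x grows."""
    return x**n / (k_half**n + x**n)


def simulate_density(
    steps: int = 2000,
    dt: float = 0.01,
    initial_density: float = 0.05,
    growth_rate: float = 1.0,
    ahl_per_cell: float = 1.0,   # assumed: C4HSL level proportional to density
    k_half: float = 1.0,         # assumed: C4HSL level giving half-maximal GAF
    hill_n: float = 4.0,         # assumed cooperativity
    circuit_on: bool = True,
) -> list[float]:
    """Forward-Euler simulation of normalized stem cell density over time."""
    density = initial_density
    trace = [density]
    for _ in range(steps):
        ahl = ahl_per_cell * density
        # GAF expression is induced by the C4HSL complex; none without the circuit.
        gaf = hill(ahl, k_half, hill_n) if circuit_on else 0.0
        # GAF inhibits proliferation: growth slows as GAF approaches 1.
        density += dt * growth_rate * density * (1.0 - gaf)
        trace.append(density)
    return trace


if __name__ == "__main__":
    with_circuit = simulate_density(circuit_on=True)
    without_circuit = simulate_density(circuit_on=False)
    print(f"final density with circuit:    {with_circuit[-1]:.2f}")
    print(f"final density without circuit: {without_circuit[-1]:.3g}")
```

In this sketch, growth without the circuit is unchecked exponential growth, whereas with the circuit the density rises quickly while low and then nearly stalls once the signal passes the half-maximal level.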

Exemplary vectors for the components shown in FIG. 3 are provided in FIGS. 28A-28G-37A-37G, where A1=Gal4, R1=ZF1, R2=ZF2, R3=ZF3, R4=ZF4, A2=CymVP16, and P is either Hef1a or minCMV.

B. Commitment Module

Referring to FIG. 3, in some embodiments, the present invention provides a commitment module comprising multiple motifs or circuit elements that allow for symmetry breaking and control the commitment of cells to differentiate. In some embodiments, the module comprises one or more of the following motifs: an oscillator, a cascade, a toggle switch, and a cell-cell communication pathway. In some preferred embodiments, these motifs interact, as described in detail below, to allow commitment to differentiate when three conditions are met within the population of cells containing the modules: a) there is a high level of uncommitted cells; b) there are cells that are allowed to commit; and c) there is a low level of committed cells.

Referring to FIG. 3, an embodiment of an oscillator is depicted by A1 and R1. In preferred embodiments, A1 is an activator and R1 is a repressor that are connected in a way so that they oscillate. In some embodiments, as depicted, the activator A1 activates itself and the repressor R1, and the repressor R1 represses the activator A1. This oscillation provides symmetry breaking for the population of cells containing the motif. Due to the oscillation of activator A1 and repressor R1, the population of cells containing activator A1 and repressor R1 is asynchronous. At any given time, only a portion of the cells in the population will have a high level of activator A1. In some embodiments, as depicted, only cells with high activator A1 are allowed to commit to differentiate; thus the remaining portion of the cells with low levels of activator A1 is reserved in an uncommitted state. This ensures that a population of uncommitted stem cells will be maintained. Otherwise, all uncommitted stem cells could commit to differentiate, allowing the exhaustion of the stem cells from the population.
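The symmetry-breaking argument above can be illustrated with a deliberately abstract sketch. Rather than modeling the A1/R1 gene circuit itself, which the text does not specify quantitatively, it assumes each cell's A1 level follows a sinusoid with its own phase, and it treats a cell as "high A1" when the level exceeds a threshold. The sinusoidal waveform, the threshold of 0.75, the uniformly random phases, and the names `a1_level`, `random_phases`, and `fraction_high` are all assumptions made for illustration.

```python
import math
import random


def a1_level(t: float, period: float, phase: float) -> float:
    """Assumed A1 waveform, normalized to the range [0, 1]."""
    return 0.5 * (1.0 + math.sin(2.0 * math.pi * t / period + phase))


def random_phases(n_cells: int, seed: int = 0) -> list[float]:
    """Uniformly random phases, standing in for an asynchronous population."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_cells)]


def fraction_high(
    phases: list[float], t: float, period: float = 1.0, threshold: float = 0.75
) -> float:
    """Fraction of cells whose A1 level exceeds the threshold at time t."""
    high = sum(1 for p in phases if a1_level(t, period, p) > threshold)
    return high / len(phases)


if __name__ == "__main__":
    asynchronous = random_phases(10_000)
    synchronous = [0.0] * 10_000  # every cell shares the same phase
    for t in (0.25, 0.75):
        print(
            f"t={t}: asynchronous {fraction_high(asynchronous, t):.2f}, "
            f"synchronous {fraction_high(synchronous, t):.2f}"
        )
```

With these assumptions, roughly one third of the asynchronous population is eligible to commit at any moment, while a fully synchronized population is either entirely eligible or entirely ineligible, which is the exhaustion risk the oscillator is meant to avoid.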

Referring to FIG. 3, an embodiment of a cascade is depicted by repressor R2, repressor R3, activator A2, and repressor R4. In some embodiments, activator A2 is operably linked to a promoter responsive to C₄HSL/RhlR and thus interfaces with the population control module. In some embodiments, as depicted, activator A1 from the oscillator activates repressor R2, which in turn represses repressor R3, which is a repressor of activator A2. Activator A2 interfaces with the population control module and is activated by the C₄HSL/RhlR complex. Thus, if the level of activator A1 is high, repressor R2 is activated and represses repressor R3, allowing the level of activator A2 to be high if there is a concomitant high concentration of uncommitted stem cells that are producing C₄HSL/RhlR. As a result, the level of activator A2 can only be high when two conditions are met: i) there is a high level of A1 within the cell; and ii) there is a high concentration of uncommitted stem cells. When these two conditions are met, activator A2 in turn activates repressor R4, which interfaces with the cell-cell communication and toggle switch motifs.

Still referring to FIG. 3, in order for cells within the population to commit as described above, a third condition must be met: a low concentration of committed cells. In some embodiments, the level of committed cells is detected via a cell-cell communication pathway, preferably a CinI/CinR cell-cell communication pathway. The invention is not limited to the use of any particular cell-cell communication pathway in the second module, although the particular cell-cell communication pathway utilized should preferably be different from the cell-cell communication pathway selected for the cell population module. In other embodiments, LuxI/LuxR, RhlI/RhlR, or viral-based cell-cell communication systems are utilized.

In some embodiments, as depicted in FIG. 3, the cell-cell communication pathway comprises an exogenous CinI gene, an exogenous CinR gene, and an exogenous gene encoding the repressor CI operably linked to a promoter responsive to the C₁₄HSL/CinR complex. High levels of C₁₄HSL are indicative of a high concentration of committed cells. Under these conditions, repressor CI is activated and in turn represses repressor R4. In some embodiments, repressor R4 interacts with the toggle switch motif as detailed below. When the level of repressor R4 is low, there is no commitment. As can be seen, the level of R4 can only be high when the level of activator A2 is high and the level of CI is low. Low levels of C₁₄HSL are indicative of a low concentration of committed cells and, under these conditions, the level of repressor CI is low, allowing the level of repressor R4 to be high.
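The three-condition logic described for the cascade and the cell-cell communication motif can be summarized as a Boolean sketch. The Python below is an illustrative simplification, not part of the disclosed system: it treats every level as simply high or low, and the function names `commitment_signals` and `truth_table` are hypothetical.

```python
from itertools import product


def commitment_signals(a1_high, c4hsl_high, c14hsl_high):
    """Boolean reading of the cascade and cell-cell communication motifs.

    a1_high     -- activator A1 is high in this cell (oscillator output)
    c4hsl_high  -- C4HSL is high (many uncommitted cells)
    c14hsl_high -- C14HSL is high (many committed cells)
    """
    r2 = a1_high                 # A1 activates R2
    r3 = not r2                  # R2 represses R3
    a2 = c4hsl_high and not r3   # A2 needs the C4HSL signal and relief from R3
    ci = c14hsl_high             # the C14HSL/CinR complex activates CI
    r4 = a2 and not ci           # A2 activates R4; CI represses R4
    return {"R2": r2, "R3": r3, "A2": a2, "CI": ci, "R4": r4}


def truth_table():
    """List ((a1, c4hsl, c14hsl), R4) for all eight input combinations."""
    rows = []
    for a1, c4, c14 in product([False, True], repeat=3):
        rows.append(((a1, c4, c14), commitment_signals(a1, c4, c14)["R4"]))
    return rows


if __name__ == "__main__":
    for inputs, r4 in truth_table():
        print(inputs, "R4 high" if r4 else "R4 low")
```

In this simplified reading, R4 is high for exactly one of the eight input combinations: A1 high, C₄HSL high, and C₁₄HSL low.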

Still referring to FIG. 3, in some embodiments, the commitment module further comprises a toggle switch motif that switches cells between an uncommitted state and a committed state. In some embodiments, as depicted in FIG. 3, the toggle switch motif comprises two repressors, in preferred embodiments TetR and LacI. In some embodiments, TetR and LacI cross-repress each other. High levels of TetR within a cell are indicative of an uncommitted state. As shown in FIG. 3, IPTG can be added to the cells to repress LacI, thus causing the cells to enter into an uncommitted state. Alternatively, as described above, repressor R4 is high in cells where activator A1 is high and where there is a high concentration of uncommitted stem cells. When these two conditions are met, along with the condition of a low number of committed cells, repressor R4 represses TetR, allowing a high level of LacI. High levels of LacI in a cell trigger the transition from an uncommitted cell to a committed cell by repressing TetR. As depicted in FIG. 3, when LacI is high, repression of CinI is released, causing the synthesis of C₁₄HSL, the activation of CI and the repression of R4. These events are indicative of a committed cell. As also depicted in FIG. 3, high LacI within a cell represses RhlI of the cell population module, inhibiting synthesis of C₄HSL.
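The switching behavior of a cross-repressing pair can be sketched numerically. The Python below is a minimal illustration under stated assumptions, not the disclosed circuit: it uses a generic dimensionless mutual-repression model with Hill-type repression, simple Euler integration, and hypothetical parameter values (`alpha`, `n`, `dt`, `steps`). The `r4` argument is an assumed way of representing repression of TetR by R4.

```python
def simulate_toggle(tetr0, laci0, r4=0.0, alpha=10.0, n=2.0, dt=0.01, steps=5000):
    """Integrate a dimensionless two-repressor toggle and return (TetR, LacI).

    Assumed model (not taken from the specification):
        dTetR/dt = alpha / ((1 + LacI**n) * (1 + r4**n)) - TetR
        dLacI/dt = alpha / (1 + TetR**n) - LacI
    """
    tetr, laci = tetr0, laci0
    for _ in range(steps):
        d_tetr = alpha / ((1.0 + laci ** n) * (1.0 + r4 ** n)) - tetr
        d_laci = alpha / (1.0 + tetr ** n) - laci
        tetr += dt * d_tetr
        laci += dt * d_laci
    return tetr, laci


if __name__ == "__main__":
    # Start biased toward TetR: the system settles in the TetR-high state.
    uncommitted = simulate_toggle(5.0, 0.0)
    # A high R4 level represses TetR and flips the switch to LacI-high.
    flipped = simulate_toggle(*uncommitted, r4=10.0)
    # After R4 is removed, the LacI-high state persists.
    committed = simulate_toggle(*flipped, r4=0.0)
    print(uncommitted, flipped, committed)
```

In this sketch the LacI-high state persists after the R4 input is withdrawn, which is the memory property that makes a toggle motif suitable for recording a commitment decision.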

Exemplary vectors for the components shown in FIG. 3 are provided in FIGS. 28A-28G through 37A-37G, where A1=Gal4, R1=ZF1, R2=ZF2, R3=ZF3, R4=ZF4, A2=CymVP16, and P is either Hef1a or minCMV.

C. Cell Differentiation Module

As shown in FIG. 3, in some embodiments, the present invention provides a cell differentiation module that is responsive to the commitment module. In some embodiments, the cell-differentiation module comprises at least an exogenous gene encoding a first cell fate regulator that is operably linked to a repressor, such as the TetR repressor. In some embodiments, as depicted in FIG. 3, the TetR/LacI toggle switch controls the expression of the cell fate regulator. When the three conditions of a) a high level of uncommitted cells, b) a cell that is allowed to commit, and c) a low level of committed cells are met, TetR is repressed, releasing the repression of a cell fate regulator such as Gata4 or Sox17 (not shown). Gata4 expression is the initial step in the programmed two-step differentiation of ES cells into β cells using established cell fate regulators. In some preferred embodiments, two-step differentiation of ES cells first to visceral or definitive endoderm and then to pancreatic β cells is facilitated by using controlled expression of established cell fate regulators triggered by cell-cell communication. A preferred system for the integration of the relevant cell fate regulators with the cell population and commitment modules is shown in FIG. 3. When the three conditions are met, either Gata4 or Sox17 is expressed. These two cell fate regulators were chosen because they are both expressed in mammalian visceral endoderm [Ritz-Laser et al., Molecular Endocrinology, 19(3):759-770, 2005; Ku et al., Stem Cells, 22:1205-1217, 2004]. Both Gata4 and Sox17 are required transcription factors in pancreatic organogenesis. Gata4 is present in very early visceral endoderm that differentiates into both insulin-producing and glucagon-producing cells [42]. Sox17 is expressed exclusively in insulin-producing cells and appears slightly later than Gata4 [43].

In preferred embodiments, endoderm differentiation is verified by visualizing the presence of endodermal markers such as Hnf3β or lamininB1 using immunohistochemistry and Western blots. Cells that remain in the stem cell state are identified by staining with alkaline phosphatase in a standard ES cell assay.

In some preferred embodiments, the cell-cell communication systems of the present invention comprise vectors that express cell fate regulators that regulate the differentiation of endoderm to β cells. Differentiation into endoderm triggers the expression of endoderm specific factors. In preferred embodiments, the genes encoding cell fate regulators are operably linked to a promoter that is regulated (preferably induced) by one or more endoderm specific factors. It is contemplated that cells that differentiate into endoderm naturally express the endodermal marker α-fetoprotein (AFP) which binds the AFP promoter. In the third module of the present invention, AFP regulates expression of Ngn3, Pdx1 and/or other cell fate regulators such as EGFP via the AFP promoter; hence the differentiation from ES to endoderm is visualized with the appearance of green fluorescence. In vivo, Ngn3 and Pdx1 are both expressed at various points during cellular differentiation from endoderm to β cells [Soria, Differentiation, 68:205-219, 2001] and both have been found to be necessary for terminal differentiation into β cells. Naturally occurring in vivo specialization of endodermal cells into β cells depends on the existence of a complex set of environmental cues which may not be present in the disease state. In the engineered system of the present invention, however, differentiation of mES cells into endoderm internally forces the cell to subsequently specialize into β cells. Upon terminal β cell differentiation, insulin production activates DsRed expression from the Mouse Insulin Promoter (MIP). Cells expressing the red fluorescent protein are assayed for terminal differentiation by immunohistochemical assays and Western blotting of β cell markers such as C-peptide, insulin and Nkx6.1. Cells not expressing the red fluorescent protein are histochemically assayed for the presence of AFP, Pdx1, Ngn3 and also for stem cell character using the standard ES cell assay.

Cells not initially induced with IPTG should not fluoresce red and should remain in the ES cell state.

In other preferred embodiments, other cell fate regulators are expressed via the differentiation module. Examples of the cell fate regulators include, but are not limited to, Nkx6.1, Nkx2.2, Isl-1, NeuroD, Pax6, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1. In preferred embodiments, these cell fate regulators are inserted into lentiviral vectors and introduced into the cell line of interest. In some preferred embodiments, the cell fate regulators are operably linked to a stage specific promoter. Stage specific promoters are promoters that are expressed at a particular stage of differentiation, such as differentiation into endoderm, ectoderm or mesoderm. In some preferred embodiments, the cell fate regulators are operably linked to an AFP promoter.

Exemplary vectors for the components shown in FIG. 3 are provided in FIGS. 28A-28G through 37A-37G, where A1=Gal4, R1=ZF1, R2=ZF2, R3=ZF3, R4=ZF4, A2=CymVP16, and P is either Hef1a or minCMV.

D. Apoptosis Module

As shown in Figure, in some embodiments, the cells of the present invention further comprise an apoptosis module. In some embodiments, the apoptosis module comprises a series of vectors comprising genes that trigger apoptosis if the cells migrate to an undesired location. In preferred embodiments, the stem cells of the present invention would undergo apoptosis if the cells leave the pancreas. As shown in FIG. 3, genes encoding an apoptosis pathway are operably linked to a repressor element responsive to a repressor that is synthesized in the presence of a signal for the pancreas (PS). In the absence of PS, the repressor is not synthesized and apoptosis is triggered.

E. Cell Penetration Element Based Cell-Cell Communication Systems

In some embodiments, the present invention provides cell-cell communication systems that utilize cell penetrating polypeptide elements. In some embodiments, the present invention provides a fusion protein comprising a secretion signal, cell penetrating polypeptide, and trans acting domain in operable combination, wherein at least one of said secretion signal, cell penetrating polypeptide and trans acting domain are from at least two different proteins. In further embodiments, the present invention provides a nucleic acid encoding the fusion protein. In further preferred embodiments, the present invention provides vectors comprising a nucleic acid encoding the fusion protein. In still other embodiments, the present invention provides cells that express the fusion protein. In some embodiments, the present invention further provides a nucleic acid comprising a promoter comprising an element that binds or is responsive to the trans-acting domain, wherein the promoter is operably linked to a protein of interest.

The present invention is not limited to the use of any particular secretion signals, cell penetrating polypeptides or trans-acting domains. Indeed, the present invention contemplates a modular and highly expandable library of artificial cell-cell communication signals by using translational fusions of cell permeable peptides with in silico designed zinc finger proteins and an enhancer or repressor. Such fusion proteins are exported by sender cells, translocate to the nucleus of receiver cells, where they control expression of genes in synthetic as well as endogenous signaling pathways by binding their cognate DNA binding sites.

Cell penetrating peptides (CPPs) are peptide sequences with the ability to translocate across the plasma membrane and to reach cytoplasmic and/or nuclear compartments in live cells after internalization. CPPs were first described in the HIV TAT-1 and the Antennapedia proteins, where the translocation seems to reflect an in vivo biological process. In the last decade, fusion proteins containing such CPPs have been widely used to deliver effector proteins into the cytoplasm or nucleus of target cells, predominantly by adding the purified fusion protein to the cell culture supernatant or injecting it intraperitoneally in vivo. The binding of TAT to the cell surface is thought to involve heparan sulfate proteoglycans, and in vitro evidence suggests that the uptake is mediated by an energy-dependent endocytic process involving lipid rafts. The release from these endocytic vesicles into the cytoplasm is less well understood, and has been shown to be a rate-limiting step in the transduction process. For TAT, the subsequent localization to the nucleus is achieved through an effective nuclear localization sequence (NLS), endogenously present in this protein.

The present invention is not limited to the use of any particular CPP peptide. The following CPP peptides find use in the present invention:

(SEQ ID NO: 24) TAT: SGYGRKKRRQRRRC
(SEQ ID NO: 25) Antp: SGRQIKIWFQNRRMKWKKC
(SEQ ID NO: 26) TAT: CRKKRRQRRRPPQG
(SEQ ID NO: 27) TAT (47-60): N-Cys-Tyr⁴⁷-Gly-Arg-Lys-Lys-Arg-Arg-Gln-Arg-Arg-Arg-Pro-Pro-Gln⁶⁰-COOH;
(SEQ ID NO: 28) ANTP: N-Cys-Arg⁴³-Gln-Ile-Lys-Ile-Trp-Phe-Gln-Asn-Arg-Arg-Met-Lys-Trp-Lys-Lys⁵⁸-COOH
Vectocell®:
(SEQ ID NO: 29) DPV3 sense, 5′-GATCCCGTAAAAAGCGTCGTCGAGAAAGCCGTAAGAAACGTCGACGTGAAAGCA-3′;
(SEQ ID NO: 30) DPV3 antisense, 5′-AGCTTGCTTTCACGTCGACGTTTCTTACGGCTTTCTCGACGACGCTTTTTACGG-3′;
(SEQ ID NO: 31) DPV15b sense, 5′-GATCCGGTGCGTATGATCTGCGTCGTCGAGAACGTCAGAGCCGTCTGCGTCGACGTGAAAGACAGAGCAGAA-3′
(SEQ ID NO: 32) DPV15b antisense, 5′-AGCTTTCTGCTCTGTCTTTCACGTCGACGCAGACGGCTCTGACGTTCTCGACGACGCAGATCATACGCACCG-3′;
(SEQ ID NO: 33) DPV1047 sense, 5′-GATCCGTTAAACGTGGACTGAAACTTCGTCATGTTCGTCCGCGTGTGACCCGTGATGTGA-3′;
(SEQ ID NO: 34) DPV1047 antisense, 5′-AGCTTCACATCACATCACGGGTCACACGCGGACGAACATGACGAAGTTTCAGTCCACGTTTAACG-3′.
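As a consistency check on a sense/antisense oligonucleotide pair, the sketch below tests whether two oligonucleotides are reverse complements of each other apart from short 5′ overhangs. It uses the DPV3 sense and antisense sequences (SEQ ID NOs: 29 and 30) with the internal spacing removed. The function names and the assumed 4-nucleotide overhang length are illustrative only. The overhangs it reports (GATC and AGCT) resemble BamHI-type and HindIII-type cohesive ends, but that reading is an assumption and is not stated in the specification.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence given 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# DPV3 oligonucleotides as listed above (SEQ ID NOs: 29 and 30), spaces removed.
DPV3_SENSE = "GATCCCGTAAAAAGCGTCGTCGAGAAAGCCGTAAGAAACGTCGACGTGAAAGCA"
DPV3_ANTISENSE = "AGCTTGCTTTCACGTCGACGTTTCTTACGGCTTTCTCGACGACGCTTTTTACGG"


def annealed_overhangs(sense, antisense, overhang=4):
    """Return the two 5' overhangs if the oligos pair beyond them, else None.

    Assumption: each oligo carries an unpaired 5' overhang of `overhang`
    nucleotides and the remainder forms a perfect duplex.
    """
    if reverse_complement(sense[overhang:]) != antisense[overhang:]:
        return None
    return sense[:overhang], antisense[:overhang]


if __name__ == "__main__":
    print(annealed_overhangs(DPV3_SENSE, DPV3_ANTISENSE))
```

The same check can be applied to the other sense/antisense pairs in the list to confirm that each antisense entry matches its labeled sense partner.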

Zinc fingers (ZFs) of the Cys2His2 type contain about 30 amino acids that form two β-strands and an α-helix that mediates interaction with a nucleotide triplet. The human genome contains at least 4000 such domains in over 700 proteins, which represents about 2% of human genes. ZFs recognizing many of the 64 possible triplets have been isolated and described. Such a ZF domain can be treated as modular, which means that multiple concatenated ZF domains (polydactyl ZFs) bind to a DTS whose length is the corresponding multiple of three nucleotides. Recently, databases and tools have become available to easily engineer ZFP-DTS pairs in silico to further streamline this process (16-18). The optimal linker sequences to connect such single ZF domains into a ZFP have also been described, as well as the optimal positioning of transcriptional activators or inhibitors.
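The modular, one-finger-per-triplet view can be made concrete with a short sketch. The Python below enumerates the 64 possible triplets and splits a target site into consecutive three-nucleotide subsites, one per finger. The function names and the example site are hypothetical, and real ZFP design (finger orientation, context effects between neighboring fingers, linker choice) is deliberately not modeled.

```python
from itertools import product

# Every nucleotide triplet that a single finger could be asked to recognize.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def split_into_triplets(site):
    """Split a DNA target site into consecutive 3-nt subsites, one per finger."""
    site = site.upper()
    if len(site) % 3 != 0:
        raise ValueError("target site length must be a multiple of three")
    if set(site) - set("ACGT"):
        raise ValueError("target site must contain only A, C, G and T")
    return [site[i:i + 3] for i in range(0, len(site), 3)]


def fingers_needed(site):
    """Number of concatenated ZF domains needed under the modular assumption."""
    return len(split_into_triplets(site))


if __name__ == "__main__":
    print(len(TRIPLETS))
    print(split_into_triplets("GATTACAGG"))
    print(fingers_needed("A" * 18))
```

Under this assumption, a three-finger protein addresses a 9-nucleotide site and a six-finger protein an 18-nucleotide site.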

A translational fusion of a CPP with a ZFP, when exported by a sender cell, translocates into the nucleus of receiver cells. The transcellular delivery of such ZFPs would massively expand the possibilities we have at our disposal to manipulate synthetic as well as endogenous signaling pathways in stem cells. This would facilitate the programmed differentiation of stem cells into tissue patterns by design. By engineering cells that can be manipulated to decide between cell fates based on the given signal and working within a 3D matrix system in vitro, we will study the potential of this system in promoting organogenesis and tissue repair. The long-term goal would be to have safe self-regulating and self-regenerating tissue/organs for clinical applications. Another purpose is to elucidate the architecture and dynamics of endogenous cell fate regulatory networks as they function to promote lineage choices.

An additional application for such an artificial cell-cell communication system would be the delivery of cell-permeable therapeutic proteins influencing cell signaling at the level of protein-protein interactions in a controlled fashion in vivo. As an example, the secretion of cell-permeable peptides of the wild-type tumor suppressor p53 from artificially engineered cells or tissue would restore cell growth control of tumor cells with mutated forms of p53, which are present in many human cancers.

The present invention is likewise not limited to the use of any particular secretion signal. A secretion signal is any DNA sequence which, when operably linked to a recombinant DNA sequence, encodes a signal peptide which is capable of causing the secretion of the recombinant polypeptide. In general, the signal peptides comprise a series of about 15 to 30 hydrophobic amino acid residues (See, e.g., Zwizinski et al., J. Biol. Chem. 255(16): 7973-77 [1980], Gray et al., Gene 39(2): 247-54 [1985], and Martial et al., Science 205: 602-607 [1979]). Such secretion signal sequences are preferably derived from genes encoding polypeptides secreted from the cell type targeted for tissue specific expression. Secretory DNA sequences, however, are not limited to such sequences. Secretory DNA sequences from proteins secreted from many cell types and organisms may also be used (e.g., the secretion signals for tPA, serum albumin, lactoferrin, and growth hormone, and secretion signals from microbial genes encoding secreted polypeptides such as from yeast, filamentous fungi, and bacteria).

FIGS. 26A-26B and 27A-27B provide examples of a synthetic cell-cell communication circuit that utilizes CPP elements. Panels 26A and 27A depict internal signaling. TAT is expressed upon the addition of Dox, binds to its promoter pTAT, and induces expression of EGFP. Panels 26B and 27B depict cell-cell signaling. Sender and receiver circuits are infected into separate cells. Sender cells express TAT upon addition of Dox; TAT is secreted and enters receiver cells, where it localizes to the nucleus and induces the expression of EGFP.

F. Vector Systems

In some embodiments, the various exogenous genes described above are provided on vectors that are introduced into the desired cell line, such as a stem cell line. In some preferred embodiments, the vectors are lentiviral vectors. In some embodiments, additional genes required for the synthesis of precursors of C₄HSL, C₁₄HSL, or 3OC₆HSL are also introduced into the desired cell line. In some preferred embodiments, vectors encoding one or more genes from the Type II bacterial fatty acid synthesis (FAS) system are also introduced into mammalian cells. In some embodiments, the vectors encode Acyl Carrier Protein (ACP) and Acyl-Acyl Carrier Protein Synthase (AAS).

In preferred embodiments, the components of the cell-cell communication system are included in separate vectors. The individual vectors are then introduced into the desired cell line. In preferred embodiments, where lentiviral vectors are utilized, the target cell line is transduced with the lentiviral vectors. The cell line may be co-infected with the vectors or the vectors may be introduced serially.

In some preferred embodiments, the genes for the components of the cell-cell communication system of the present invention (e.g., LuxI, ACP, and AAS) are optimized for expression in mammalian cells. In other embodiments, genes comprising the entire bacterial Type II Fatty Acid Synthase system are included on vectors and introduced in the cell line.

In preferred embodiments, as described in more detail below, the components of the cell-cell communication systems are introduced into a mammalian cell line. The present invention is not limited to the use of any particular cell lines. In some embodiments, the cell lines are pluripotent, multipotent or totipotent cell lines. In some preferred embodiments, the cell lines are stem cell lines. In preferred embodiments, the components are incorporated into expression vectors that are introduced into mammalian cells. In some preferred embodiments, the genes encoding the components of the cell-cell communication system are operably linked to exogenous promoters.

The modules and motifs of the present invention generally comprise multiple exogenous genes operably linked to promoters. In preferred embodiments, the exogenous genes operably linked to promoters are included in a vector. In preferred embodiments, the vectors are introduced into a cell. The present invention is not limited to the use of any particular vector system. Indeed, the use of a variety of vector systems is contemplated. In some preferred embodiments, the vectors are lentiviral vectors. In other embodiments, the vectors are retroviral vectors, pseudotyped retroviral vectors, pseudotyped lentiviral vectors, adenovirus vectors, plasmids, transposons, or artificial chromosomes.

The present invention also contemplates the use of lentiviral vectors to generate high copy number cell lines. The lentiviruses (e.g., equine infectious anemia virus, caprine arthritis-encephalitis virus, human immunodeficiency virus) are a subfamily of retroviruses that are able to integrate into non-dividing cells. The lentiviral genome and the proviral DNA have the three genes found in all retroviruses: gag, pol, and env, which are flanked by two LTR sequences. The gag gene encodes the internal structural proteins (e.g., matrix, capsid, and nucleocapsid proteins); the pol gene encodes the reverse transcriptase, protease, and integrase proteins; and the env gene encodes the viral envelope glycoproteins. The 5′ and 3′ LTRs control transcription and polyadenylation of the viral RNAs. Additional genes in the lentiviral genome include the vif, vpr, tat, rev, vpu, nef, and vpx genes.

A variety of lentiviral vectors and packaging cell lines are known in the art and find use in the present invention (See, e.g., U.S. Pat. Nos. 5,994,136 and 6,013,516, both of which are herein incorporated by reference). Furthermore, the VSV G protein has also been used to pseudotype retroviral vectors based upon the human immunodeficiency virus (HIV) (Naldini et al., Science 272:263 [1996]). Thus, the VSV G protein may be used to generate a variety of pseudotyped retroviral vectors and is not limited to vectors based on MoMLV. The lentiviral vectors may also be modified as described above to contain various regulatory sequences (e.g., signal peptide sequences, RNA export elements, and IRES's). After the lentiviral vectors are produced, they may be used to transfect host cells as described above for retroviral vectors.

In general, retroviruses (family Retroviridae) are divided into three groups: the spumaviruses (e.g., human foamy virus); the lentiviruses (e.g., human immunodeficiency virus and sheep visna virus) and the oncoviruses (e.g., MLV, Rous sarcoma virus).

Retroviruses are enveloped (i.e., surrounded by a host cell-derived lipid bilayer membrane) single-stranded RNA viruses which infect animal cells. When a retrovirus infects a cell, its RNA genome is converted into a double-stranded linear DNA form (i.e., it is reverse transcribed). The DNA form of the virus is then integrated into the host cell genome as a provirus. The provirus serves as a template for the production of additional viral genomes and viral mRNAs. Mature viral particles containing two copies of genomic RNA bud from the surface of the infected cell. The viral particle comprises the genomic RNA, reverse transcriptase and other pol gene products inside the viral capsid (which contains the viral gag gene products), which is surrounded by a lipid bilayer membrane derived from the host cell containing the viral envelope glycoproteins (also referred to as membrane-associated proteins).

The organization of the genomes of numerous retroviruses is well known to the art and this has allowed the adaptation of the retroviral genome to produce retroviral vectors. The production of a recombinant retroviral vector carrying a gene of interest is typically achieved in two stages.

First, the gene of interest is inserted into a retroviral vector which contains the sequences necessary for the efficient expression of the gene of interest (including promoter and/or enhancer elements which may be provided by the viral long terminal repeats (LTRs) or by an internal promoter/enhancer and relevant splicing signals), sequences required for the efficient packaging of the viral RNA into infectious virions (e.g., the packaging signal (Psi), the tRNA primer binding site (−PBS), the 3′ regulatory sequences required for reverse transcription (+PBS)) and the viral LTRs. The LTRs contain sequences required for the association of viral genomic RNA, reverse transcriptase and integrase functions, and sequences involved in directing the expression of the genomic RNA to be packaged in viral particles. For safety reasons, many recombinant retroviral vectors lack functional copies of the genes that are essential for viral replication (these essential genes are either deleted or disabled); therefore, the resulting virus is said to be replication defective.

Second, following the construction of the recombinant vector, the vector DNA is introduced into a packaging cell line. Packaging cell lines provide proteins required in trans for the packaging of the viral genomic RNA into viral particles having the desired host range (i.e., the viral-encoded gag, pol and env proteins). The host range is controlled, in part, by the type of envelope gene product expressed on the surface of the viral particle. Packaging cell lines may express ecotropic, amphotropic or xenotropic envelope gene products. Alternatively, the packaging cell line may lack sequences encoding a viral envelope (env) protein. In this case the packaging cell line will package the viral genome into particles that lack a membrane-associated protein (e.g., an env protein). In order to produce viral particles containing a membrane-associated protein that will permit entry of the virus into a cell, the packaging cell line containing the retroviral sequences is transfected with sequences encoding a membrane-associated protein (e.g., the G protein of vesicular stomatitis virus (VSV)). The transfected packaging cell will then produce viral particles which contain the membrane-associated protein expressed by the transfected packaging cell line; these viral particles, which contain viral genomic RNA derived from one virus encapsidated by the envelope proteins of another virus, are said to be pseudotyped virus particles.

The retroviral vectors of the present invention can be further modified to include additional regulatory sequences. As described above, the retroviral vectors of the present invention include the following elements in operable association: a) a 5′ LTR; b) a packaging signal; c) a 3′ LTR and d) a nucleic acid encoding a protein of interest located between the 5′ and 3′ LTRs. In some embodiments, the nucleic acid of interest is operably linked to a promoter of interest. As described above, in preferred embodiments, the nucleic acid of interest is a gene encoding a protein from a quorum sensing system, or a gene encoding a cell fate regulator. In some preferred embodiments, the promoter is a promoter responsive to an autoinducer/regulatory partner complex synthesized by the quorum sensing pathway, an inducible promoter, a repressible promoter, or a stage specific promoter such as the AFP promoter. In some embodiments of the present invention, the nucleic acid of interest may be arranged in opposite orientation to the 5′ LTR when transcription from an internal promoter is desired.

In other embodiments of the present invention, the vectors are modified by incorporating an RNA export element (See, e.g., U.S. Pat. Nos. 5,914,267; 6,136,597, and 5,686,120; and WO99/14310, all of which are incorporated herein by reference) either 3′ or 5′ to the nucleic acid sequence encoding the protein of interest. It is contemplated that the use of RNA export elements allows high levels of expression of the protein of interest without incorporating splice signals or introns in the nucleic acid sequence encoding the protein of interest.

In still other embodiments, the vector further comprises at least one internal ribosome entry site (IRES) sequence. The sequences of several suitable IRES's are available, including, but not limited to, those derived from foot and mouth disease virus (FDV), encephalomyocarditis virus, and poliovirus. The IRES sequence can be interposed between two transcriptional units (e.g., nucleic acids encoding different proteins of interest or subunits of a multisubunit protein such as an antibody) to form a polycistronic sequence so that the two transcriptional units are transcribed from the same promoter.

The retroviral vectors of the present invention may also further comprise a selectable marker allowing selection of transformed cells. A number of selectable markers find use in the present invention, including, but not limited to, the bacterial aminoglycoside 3′ phosphotransferase gene (also referred to as the neo gene) that confers resistance to the drug G418 in mammalian cells, the bacterial hygromycin B phosphotransferase (hyg) gene that confers resistance to the antibiotic hygromycin, and the bacterial xanthine-guanine phosphoribosyl transferase gene (also referred to as the gpt gene) that confers the ability to grow in the presence of mycophenolic acid. In some embodiments, the selectable marker gene is provided as part of a polycistronic sequence that also encodes the protein of interest.

Viral vectors, including recombinant lentiviral vectors, provide a more efficient means of transferring genes into cells as compared to other techniques such as calcium phosphate-DNA co-precipitation or DEAE-dextran-mediated transfection, electroporation or microinjection of nucleic acids. It is believed that the efficiency of viral transfer is due in part to the fact that the transfer of nucleic acid is a receptor-mediated process (i.e., the virus binds to a specific receptor protein on the surface of the cell to be infected). In addition, the virally transferred nucleic acid, once inside a cell, integrates in a controlled manner, in contrast to the integration of nucleic acids which are not virally transferred; nucleic acids transferred by other means such as calcium phosphate-DNA co-precipitation are subject to rearrangement and degradation.

The most commonly used recombinant retroviral vectors are derived from the amphotropic Moloney murine leukemia virus (MoMLV) (See e.g., Miller and Baltimore Mol. Cell. Biol. 6:2895 [1986]). The MoMLV system has several advantages: 1) this specific retrovirus can infect many different cell types, 2) established packaging cell lines are available for the production of recombinant MoMLV viral particles and 3) the transferred genes are permanently integrated into the target cell chromosome. The established MoMLV vector systems comprise a DNA vector containing a small portion of the retroviral sequence (e.g., the viral long terminal repeat or “LTR” and the packaging or “psi” signal) and a packaging cell line. The gene to be transferred is inserted into the DNA vector. The viral sequences present on the DNA vector provide the signals necessary for the insertion or packaging of the vector RNA into the viral particle and for the expression of the inserted gene. The packaging cell line provides the proteins required for particle assembly (Markowitz et al., J. Virol. 62:1120 [1988]).

In some preferred embodiments, the retroviral vector is pseudotyped. (See, e.g., U.S. Pat. No. 5,512,421, which is incorporated herein by reference). In some preferred embodiments, the pseudotyped retrovirus contains the G protein of VSV as the membrane associated protein. Unlike retroviral envelope proteins that bind to a specific cell surface protein receptor to gain entry into a cell, the VSV G protein interacts with a phospholipid component of the plasma membrane (Mastromarino et al., J. Gen. Virol. 68:2359 [1977]). Because entry of VSV into a cell is not dependent upon the presence of specific protein receptors, VSV has an extremely broad host range. Pseudotyped retroviral vectors bearing the VSV G protein have an altered host range characteristic of VSV (i.e., they can infect almost all species of vertebrate, invertebrate and insect cells). Importantly, VSV G-pseudotyped retroviral vectors can be concentrated 2000-fold or more by ultracentrifugation without significant loss of infectivity (Burns et al. Proc. Natl. Acad. Sci. USA 90:8033 [1993]).

The present invention also contemplates the use of adeno-associated virus (AAV) vectors. AAV is a human DNA parvovirus, which belongs to the genus Dependovirus. The AAV genome is composed of a linear, single-stranded DNA molecule that contains approximately 4680 bases. The genome includes inverted terminal repeats (ITRs) at each end that function in cis as origins of DNA replication and as packaging signals for the virus. The internal nonrepeated portion of the genome includes two large open reading frames, known as the AAV rep and cap regions, respectively. These regions code for the viral proteins involved in replication and packaging of the virion. A family of at least four viral proteins is synthesized from the AAV rep region, Rep 78, Rep 68, Rep 52 and Rep 40, named according to their apparent molecular weight. The AAV cap region encodes at least three proteins, VP1, VP2 and VP3 (for a detailed description of the AAV genome, see e.g., Muzyczka, Current Topics Microbiol. Immunol. 158:97-129 [1992]; Kotin, Human Gene Therapy 5:793-801 [1994]).

AAV requires coinfection with an unrelated helper virus, such as adenovirus, a herpesvirus or vaccinia, in order for a productive infection to occur. In the absence of such coinfection, AAV establishes a latent state by insertion of its genome into a host cell chromosome. Subsequent infection by a helper virus rescues the integrated copy, which can then replicate to produce infectious viral progeny. Unlike the non-pseudotyped retroviruses, AAV has a wide host range and is able to replicate in cells from any species so long as there is coinfection with a helper virus that will also multiply in that species. Thus, for example, human AAV will replicate in canine cells coinfected with a canine adenovirus. Furthermore, unlike the retroviruses, AAV is not associated with any human or animal disease, does not appear to alter the biological properties of the host cell upon integration and is able to integrate into nondividing cells. It has also recently been found that AAV is capable of site-specific integration into a host cell genome.

In light of the above-described properties, a number of recombinant AAV vectors have been developed for gene delivery (See, e.g., U.S. Pat. Nos. 5,173,414; 5,139,941; WO 92/01070 and WO 93/03769, all of which are incorporated herein by reference; Lebkowski et al., Molec. Cell. Biol. 8:3988-3996 [1988]; Carter, Current Opinion in Biotechnology 3:533-539 [1992]; Muzyczka, Current Topics in Microbiol. and Immunol. 158:97-129 [1992]; Kotin, Human Gene Therapy 5:793-801 [1994]; Shelling and Smith, Gene Therapy 1:165-169 [1994]; and Zhou et al., J. Exp. Med. 179:1867-1875 [1994]).

Recombinant AAV virions can be produced in a suitable host cell that has been transfected with both an AAV helper plasmid and an AAV vector. An AAV helper plasmid generally includes AAV rep and cap coding regions, but lacks AAV ITRs. Accordingly, the helper plasmid can neither replicate nor package itself. An AAV vector generally includes a selected gene of interest bounded by AAV ITRs that provide for viral replication and packaging functions. Both the helper plasmid and the AAV vector bearing the selected gene are introduced into a suitable host cell by transient transfection. The transfected cell is then infected with a helper virus, such as an adenovirus, which transactivates the AAV promoters present on the helper plasmid that direct the transcription and translation of AAV rep and cap regions. Recombinant AAV virions harboring the selected gene are formed and can be purified from the preparation. Once the AAV vectors are produced, they may be used to transfect (See, e.g., U.S. Pat. No. 5,843,742, herein incorporated by reference) host cells at the desired multiplicity of infection to produce high copy number host cells. As will be understood by those skilled in the art, the AAV vectors may also be modified as described above to contain various regulatory sequences (e.g., signal peptide sequences, RNA export elements, and IRES's).
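The relationship between the multiplicity of infection and the number of vector copies per cell can be illustrated numerically. The following is a minimal sketch that assumes vector uptake follows a Poisson distribution whose mean equals the multiplicity of infection; that statistical model and the function name `copy_number_distribution` are assumptions made for illustration and are not part of this description.

```python
import math


def copy_number_distribution(moi, max_copies=10):
    """Return P(k copies per cell) for k = 0..max_copies.

    Assumption (illustrative only): each cell receives a Poisson-distributed
    number of vector copies with mean equal to the multiplicity of infection.
    """
    if moi < 0:
        raise ValueError("MOI must be non-negative")
    return [
        math.exp(-moi) * moi ** k / math.factorial(k)
        for k in range(max_copies + 1)
    ]


if __name__ == "__main__":
    for moi in (1, 10, 100):
        probs = copy_number_distribution(moi)
        # Fraction of cells expected to carry no vector copy at this MOI.
        print(f"MOI {moi}: P(no copy) = {probs[0]:.3g}")
```

Under this model, raising the multiplicity of infection lowers the expected fraction of cells that carry no copy and shifts the distribution toward higher copy numbers.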

The present invention also contemplates the use of transposon vectors. Transposons are mobile genetic elements that can move or transpose from one location to another in the genome. Transposition within the genome is controlled by a transposase enzyme that is encoded by the transposon. Many examples of transposons are known in the art, including, but not limited to, Tn5 (See e.g., de la Cruz et al., J. Bact. 175: 6932-38 [1993]), Tn7 (See e.g., Craig, Curr. Topics Microbiol. Immunol. 204: 27-48 [1996]), and Tn10 (See e.g., Morisato and Kleckner, Cell 51:101-111 [1987]). The ability of transposons to integrate into genomes has been utilized to create transposon vectors (See, e.g., U.S. Pat. Nos. 5,719,055; 5,968,785; 5,958,775; and 6,027,722; all of which are incorporated herein by reference). Because transposons are not infectious, transposon vectors are introduced into host cells via methods known in the art (e.g., electroporation, lipofection, or microinjection). Therefore, the ratio of transposon vectors to host cells may be adjusted to provide the desired multiplicity of infection to produce the high copy number host cells of the present invention.

Transposon vectors suitable for use in the present invention generally comprise a nucleic acid encoding a protein of interest interposed between two transposon insertion sequences. Some vectors also comprise a nucleic acid sequence encoding a transposase enzyme. In these vectors, one of the insertion sequences is positioned between the transposase enzyme and the nucleic acid encoding the protein of interest so that it is not incorporated into the genome of the host cell during recombination. Alternatively, the transposase enzyme may be provided by a suitable method (e.g., lipofection or microinjection). As will be understood by those skilled in the art, the transposon vectors may also be modified as described above to contain various regulatory sequences (e.g., signal peptide sequences, RNA export elements, and IRES's).

In some preferred embodiments, the quorum sensing system of the present invention comprises one or more of the vectors described in FIGS. 5A-5D-25A-25G, and SEQ ID Nos:01-12. FIGS. 5A-5D and 6 provide a map and sequence (SEQ ID NO:01) for the LuxI vector-pLV-Hef1a-LuxIm-IRES2-DsRed2; FIGS. 7A-7E and 8 provide a map and sequence (SEQ ID NO:02) for the LuxR vector-pLV-Hef1a-p65H4LuxRFm-IRES2-DsRed2; FIGS. 9A-9E and 10 provide a map and sequence (SEQ ID NO:03) for the lux promoter vector for expression of Gata4/Sox17-pLV-minCMVLuxO7-IRES2-EGFP; FIGS. 11A-11E and 12 provide a map and sequence (SEQ ID NO:04) for the ACP vector-pLV-Hef1a-ACPm-IRES2-DsRed2; FIGS. 13A-13F and 14 provide a map and sequence (SEQ ID NO:05) for the AAS vector-pLV-Hef1a-AAS-IRES2-EGFP; FIGS. 15A-15E and 16 provide a map and sequence (SEQ ID NO:06) for the AFP promoter vector PDX1-pLV-AFP-Pdx1-IRES2-DsRed2; FIGS. 17A-17E and 18 provide a map and sequence (SEQ ID NO:07) for the AFP promoter vector Ngn3-pLV-AFP-Ngn3-IRES2-DsRed2; FIGS. 19A-19E and provide a map and sequence (SEQ ID NO:08) for the AFP promoter vector for TetRKRAB-pLV-AFP-TetRKRAB-IRES2-DsRed2; FIGS. 21A-21E and 22 provide a map and sequence (SEQ ID NO:09) for the RhlI vector-pLV-Hef1a-RhlI-IRES2-DsRed2.

It will be recognized that the foregoing vectors are plasmid vectors utilized for the production of lentiviral vectors that are used to transduce target cells such as embryonic stem cells or adult stem cells. The quorum sensing pathway components are derived from bacterial genes. In preferred embodiments, the bacterial genes are codon optimized for expression in mammalian cells. It will also be recognized that the sequences of the components of the vectors may be varied. Accordingly, the present invention encompasses the use of vector components, including the genes of interest such as LuxI, CinI, RhlI, LuxR, CinR, RhlR, and any of the cell fate regulators, that are at least 50%, 70%, 80%, 90%, or 95% identical to the wild type gene of interest and maintain the function of the gene of interest. Likewise, the present invention encompasses the use of promoters that are at least 50%, 70%, 80%, 90%, or 95% identical to the promoter sequences described herein such as the lux promoter, AFP promoter, etc. In some preferred embodiments, the genes encoding LuxR, CinR, or RhlR are modified to include a mammalian activation domain. In preferred embodiments, the mammalian activation domain is the P65 mammalian activation domain. In preferred embodiments, the LuxR, CinR, or RhlR regulator proteins are therefore fusions with a mammalian activation domain. Accordingly, in some embodiments, the present invention provides vectors and systems comprising vectors that comprise a gene encoding a regulator protein-mammalian activation domain fusion protein.
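The percent identity thresholds recited above can be illustrated with a short calculation. The following is a minimal sketch, assuming the two sequences have already been aligned to equal length, that gaps are marked with `-`, and that identity is computed over the columns in which neither sequence has a gap. The alignment method, the gap convention, and the function names are assumptions made for illustration; other conventions for the denominator exist and would give different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption (illustrative only): columns containing a gap in either
    sequence are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold):
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from any SEQ ID NO.
    wild_type = "ATGGCC-TAA"
    variant = "ATGGCTATAA"
    for cutoff in (50, 70, 80, 90, 95):
        print(cutoff, meets_threshold(wild_type, variant, cutoff))
```

A sequence that satisfies an identity threshold would still need to maintain the function of the gene of interest, which this calculation does not assess.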

In further embodiments, the present invention provides promoters that are inducible by a regulatory protein-autoinducer complex. In some embodiments, the promoter comprises at least one, and preferably 2, 3, 4, 5, 6, 7, 8, 9, or 10 sequences that bind the regulatory protein-autoinducer complex. In some embodiments, the promoter further comprises a minimal element from a mammalian promoter. In some embodiments, the minimal element is derived from the cytomegalovirus promoter; an example of such a promoter is provided in FIGS. 9A-9E and 10, SEQ ID NO:3.

Accordingly, in some preferred embodiments, the present invention provides vectors with the following components in operable association:

5′LTR-Promoter-Mammalian codon optimized LuxI-3′LTR

5′LTR-Promoter-Mammalian codon optimized LuxI-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized LuxI-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized LuxI-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlI-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlI-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized RhlI-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized RhlI-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinI-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinI-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized CinI-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized CinI-3′LTR

5′LTR-Promoter-Mammalian codon optimized LuxR-3′LTR

5′LTR-Promoter-Mammalian codon optimized LuxR-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized LuxR-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized LuxR-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlR-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlR-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized RhlR-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized RhlR-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinR-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinR-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized CinR-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized CinR-3′LTR

5′LTR-Promoter-Mammalian codon optimized LuxR P65 fusion-3′LTR

5′LTR-Promoter-Mammalian codon optimized LuxR P65 fusion-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized LuxR P65 fusion-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized LuxR P65 fusion-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlR P65 fusion-3′LTR

5′LTR-Promoter-Mammalian codon optimized RhlR P65 fusion-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized RhlR P65 fusion-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized RhlR P65 fusion-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinR P65 fusion-3′LTR

5′LTR-Promoter-Mammalian codon optimized CinR P65 fusion-IRES-Reporter-3′LTR

5′LTR-Repressible promoter-Mammalian codon optimized CinR P65 fusion-3′LTR

5′LTR-LacI promoter-Mammalian codon optimized CinR P65 fusion-3′LTR

5′LTR-regulatory protein/autoinducer responsive promoter-cell fate regulator gene-3′LTR

5′LTR-regulatory protein/autoinducer responsive promoter-cell fate regulator gene-IRES-reporter gene-3′LTR

5′LTR-regulatory protein/autoinducer responsive promoter-cell fate regulator gene-IRES-selectable marker-3′LTR

5′LTR-lux promoter-cell fate regulator gene-3′LTR

5′LTR-regulatory protein/autoinducer responsive promoter-Gata4 gene-3′LTR

5′LTR-regulatory protein/autoinducer responsive promoter-Sox17-3′LTR

5′LTR-promoter-mammalian codon optimized ACP-3′LTR

5′LTR-promoter-mammalian codon optimized AAS-3′LTR

5′LTR-stage specific promoter-cell fate regulator-3′LTR

5′LTR-AFP promoter-cell fate regulator-3′LTR

5′LTR-stage specific promoter-cell fate regulator-IRES-reporter-3′LTR

5′LTR-AFP promoter-cell fate regulator-IRES-reporter-3′LTR

5′LTR-stage specific promoter-cell fate regulator-IRES-selectable marker-3′LTR

5′LTR-AFP promoter-cell fate regulator-IRES-selectable marker-3′LTR

5′LTR-stage specific promoter-Pdx1 gene-3′LTR

5′LTR-AFP promoter-Pdx1 gene-3′LTR

5′LTR-stage specific promoter-Pdx1 gene-IRES-reporter-3′LTR

5′LTR-AFP promoter-Pdx1 gene-IRES-reporter-3′LTR

5′LTR-stage specific promoter-Pdx1 gene-IRES-selectable marker-3′LTR

5′LTR-AFP promoter-Pdx1 gene-IRES-selectable marker-3′LTR

5′LTR-stage specific promoter-Ngn3 gene-3′LTR

5′LTR-AFP promoter-Ngn3 gene-3′LTR

5′LTR-stage specific promoter-Ngn3 gene-IRES-reporter-3′LTR

5′LTR-AFP promoter-Ngn3 gene-IRES-reporter-3′LTR

5′LTR-stage specific promoter-Ngn3 gene-IRES-selectable marker-3′LTR

5′LTR-AFP promoter-Ngn3 gene-IRES-selectable marker-3′LTR

5′LTR-stage specific promoter-TetR-3′LTR

5′LTR-AFP promoter-TetR-3′LTR

5′LTR-stage specific promoter-TetR-IRES-reporter-3′LTR

5′LTR-AFP promoter-TetR-IRES-reporter-3′LTR

5′LTR-stage specific promoter-TetR-IRES-selectable marker-3′LTR

5′LTR-AFP promoter-TetR-IRES-selectable marker-3′LTR

5′LTR-terminal differentiation promoter-repressor-3′LTR

5′LTR-MIP promoter-repressor-3′LTR

5′LTR-stage specific promoter-repressor-IRES-reporter-3′LTR

5′LTR-MIP promoter-repressor-IRES-reporter-3′LTR

5′LTR-stage specific promoter-repressor-IRES-selectable marker-3′LTR

5′LTR-MIP promoter-repressor-IRES-selectable marker-3′LTR

5′LTR-terminal differentiation promoter-LacI-3′LTR

5′LTR-MIP promoter-LacI-3′LTR

5′LTR-stage specific promoter-LacI-IRES-reporter-3′LTR

5′LTR-MIP promoter-LacI-IRES-reporter-3′LTR

5′LTR-stage specific promoter-LacI-IRES-selectable marker-3′LTR

5′LTR-MIP promoter-LacI-IRES-selectable marker-3′LTR

In the vectors described above, the 5′ and 3′ LTRs are preferably retroviral LTRs and most preferably lentiviral LTRs. The vectors can preferably comprise additional elements in addition to the listed elements.
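The first thirty-six vector architectures listed above follow a regular pattern: each of nine coding sequences is combined with each of four arrangements. The following is a minimal sketch that regenerates those entries by enumeration. The component names are taken from the list above, but the grouping into `GENES` and `ARCHITECTURES` and the function name `enumerate_vectors` are assumptions made for illustration.

```python
from itertools import product

# Coding sequences named in the list above (synthases, regulators, and
# regulator-P65 fusions). The grouping itself is an illustrative assumption.
GENES = [
    "LuxI", "RhlI", "CinI",
    "LuxR", "RhlR", "CinR",
    "LuxR P65 fusion", "RhlR P65 fusion", "CinR P65 fusion",
]

# The four arrangements that recur for each coding sequence.
ARCHITECTURES = [
    "5′LTR-Promoter-Mammalian codon optimized {gene}-3′LTR",
    "5′LTR-Promoter-Mammalian codon optimized {gene}-IRES-Reporter-3′LTR",
    "5′LTR-Repressible promoter-Mammalian codon optimized {gene}-3′LTR",
    "5′LTR-LacI promoter-Mammalian codon optimized {gene}-3′LTR",
]


def enumerate_vectors(genes=GENES, architectures=ARCHITECTURES):
    """Return every gene/architecture combination, gene by gene."""
    return [
        arch.format(gene=gene)
        for gene, arch in product(genes, architectures)
    ]


if __name__ == "__main__":
    for vector in enumerate_vectors():
        print(vector)
```

The remaining entries in the list (responsive-promoter, stage-specific-promoter, and repressor vectors) could be enumerated in the same way with their own component lists.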

G. Cells

The quorum sensing system of the present invention may be introduced into a variety of mammalian cell types. In preferred embodiments, the quorum sensing system is introduced into embryonic or adult stem cells. However, the quorum sensing systems may be introduced into any mammalian cell lines, including, but not limited to, 293 cells; Chinese hamster ovary cells (CHO-K1, ATCC CCL-61); bovine mammary epithelial cells (ATCC CRL 10274); monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture; see, e.g., Graham et al., J. Gen Virol., 36:59 [1977]); baby hamster kidney cells (BHK, ATCC CCL 10); mouse sertoli cells (TM4, Mather, Biol. Reprod. 23:243-251 [1980]); monkey kidney cells (CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (WI38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci., 383:44-68 [1982]); MRC 5 cells; FS4 cells; rat fibroblasts (208F cells); MDBK cells (bovine kidney cells); and a human hepatoma line (Hep G2).

The present invention is not limited to the use of any particular type of embryonic stem cells. Indeed, the use of embryonic stem cells from a number of animal species is contemplated. Methods for obtaining totipotent or pluripotent cells from humans, monkeys, mice, rats, pigs, cattle and sheep have been previously described. See, e.g., U.S. Pat. Nos. 5,453,357; 5,523,226; 5,589,376; 5,340,740; and 5,166,065 (all of which are specifically incorporated herein by reference); as well as, Evans, et al., Theriogenology 33(1):125-128, 1990; Notarianni, et al., J. Reprod. Fertil. 41(Suppl.):51-56, 1990; Giles, et al., Mol. Reprod. Dev. 36:130-138, 1993; Graves, et al., Mol. Reprod. Dev. 36:424-433, 1993; Sukoyan, et al., Mol. Reprod. Dev. 33:418-431, 1992; Sukoyan, et al., Mol. Reprod. Dev. 36:148-158, 1993; Iannaccone, et al., Dev. Biol. 163:288-292, 1994; Evans & Kaufman, Nature 292:154-156, 1981; Martin, Proc Natl Acad Sci USA 78:7634-7638, 1981; Doetschman et al. Dev Biol 127:224-227, 1988; Giles et al. Mol Reprod Dev 36:130-138, 1993; Graves & Moreadith, Mol Reprod Dev 36:424-433, 1993 and Bradley, et al., Nature 309:255-256, 1984.

Primate embryonic stem cells may be preferably obtained by the methods disclosed in U.S. Pat. Nos. 5,843,780 and 6,200,806, each of which is incorporated herein by reference. Primate (including human) stem cells may also be obtained from commercial sources such as WiCell, Madison, Wis. A preferable medium for isolation of embryonic stem cells is “ES medium.” ES medium consists of 80% Dulbecco's modified Eagle's medium (DMEM; no pyruvate, high glucose formulation, Gibco BRL), with 20% fetal bovine serum (FBS; Hyclone), 0.1 mM β-mercaptoethanol (Sigma), and 1% non-essential amino acid stock (Gibco BRL). Preferably, fetal bovine serum batches are compared by testing clonal plating efficiency of a low passage mouse ES cell line (ES_(jt3)), a cell line developed just for the purpose of this test. FBS batches must be compared because it has been found that batches vary dramatically in their ability to support embryonic cell growth, but any other method of assaying the competence of FBS batches for support of embryonic cells will work as an alternative.

Primate ES cells are isolated on a confluent layer of murine embryonic fibroblasts in the presence of ES cell medium. Embryonic fibroblasts are preferably obtained from 12-day-old fetuses from outbred CF1 mice (SASCO), but other strains may be used as an alternative. Tissue culture dishes are preferably treated with 0.1% gelatin (type 1; Sigma).

Recovery of rhesus monkey embryos has been demonstrated, with recovery of an average 0.4 to 0.6 viable embryos per rhesus monkey per month, Seshagiri et al. Am J Primatol 29:81-91, 1993. Embryo collection from marmoset monkey is also well documented (Thomson et al. “Non-surgical uterine stage preimplantation embryo collection from the common marmoset,” J Med Primatol, 23:333-336 (1994)). Here, the zona pellucida is removed from blastocysts by brief exposure to pronase (Sigma). For immunosurgery, blastocysts are exposed to a 1:50 dilution of rabbit anti-marmoset spleen cell antiserum (for marmoset blastocysts) or a 1:50 dilution of rabbit anti-rhesus monkey (for rhesus monkey blastocysts) in DMEM for 30 minutes, then washed for 5 minutes three times in DMEM, then exposed to a 1:5 dilution of Guinea pig complement (Gibco) for 3 minutes.

After two further washes in DMEM, lysed trophectoderm cells are removed from the intact inner cell mass (ICM) by gentle pipetting, and the ICM is plated on inactivated (3000 rads gamma irradiation) mouse embryonic fibroblasts. After 7-21 days, ICM-derived masses are removed from endoderm outgrowths with a micropipette under direct observation with a stereo microscope, exposed to 0.05% Trypsin-EDTA (Gibco) supplemented with 1% chicken serum for 3-5 minutes and dissociated by gentle pipetting through a flame polished micropipette.

Dissociated cells are replated on embryonic feeder layers in fresh ES medium, and observed for colony formation. Colonies demonstrating ES-like morphology are individually selected, and split again as described above. The ES-like morphology is defined as compact colonies having a high nucleus to cytoplasm ratio and prominent nucleoli. Resulting ES cells are then routinely split by brief trypsinization or exposure to Dulbecco's Phosphate Buffered Saline (without calcium or magnesium and with 2 mM EDTA) every 1-2 weeks as the cultures become dense. Early passage cells are also frozen and stored in liquid nitrogen.

The present invention is not limited to the use of any particular adult stem cell. The adult stem cell is an undifferentiated (unspecialized) cell that is found in a differentiated (specialized) tissue; it can renew itself and become specialized to yield specialized cell types of the tissue from which it originated. These precursor cells exist within the differentiated tissues of the adult of all multicellular organisms. Precursor cells derived from adults can be divided into three categories based on their potential for differentiation. These three categories of precursor cells are epiblast-like stem cells, germ layer lineage stem cells, and progenitor cells. Precursor cells have been isolated from a wide variety of tissues, including, but not limited to, skeletal muscle, dermis, fat, cardiac muscle, granulation tissue, periosteum, perichondrium, brain, meninges, nerve sheaths, ligaments, tendons, blood vessels, bone marrow, trachea, lungs, esophagus, stomach, liver, intestines, spleen, pancreas, kidney, urinary bladder, and testis. Precursor cells can be released from the connective tissue compartments throughout the body by mechanical disruption and/or enzymatic digestion and have been isolated from, but not limited to, newborn, adolescent, and geriatric mice, rats and humans, and adult rabbits, dogs, goats, sheep, and pigs.

The first category of precursor cells, epiblast-like stem cells (ELSCs), consists of a stem cell that will form cells from all three embryonic germ layer lineages. Stem cells from adult rats and stem cells from adult humans can be released from the connective tissue compartments throughout the body by mechanical disruption and/or enzymatic digestion. The stem cells from either adult rats or adult humans can be preferentially slow frozen and stored at −80° C.±5° C. using 7.5% ultra-pure dimethyl sulfoxide. Fast thawing of stem cells from both species from the frozen state to ambient temperature yields recovery rates exceeding 98%. These cells in the undifferentiated state express the Oct-3/4 gene that is characteristic of embryonic stem cells. ELSCs do not spontaneously differentiate in a serum free environment lacking progression agents, proliferation agents, lineage-induction agents, and/or inhibitory factors, such as recombinant human leukemia inhibitory factor (LIF), recombinant murine leukemia inhibitory factor (ESGRO), or recombinant human anti-differentiation factor (ADF). Embryonic stem cells spontaneously differentiate under these conditions. In contrast, ELSCs derived from both species remain quiescent unless acted upon by specific proliferative and/or inductive agents and/or environment.

ELSCs proliferate to form multiple confluent layers of cells in vitro in the presence of proliferation agents such as platelet-derived growth factors and respond to lineage-induction agents. ELSCs respond to hepatocyte growth factor by forming cells belonging to the endodermal lineage. Cell lines have expressed phenotypic markers for many discrete cell types of ectodermal, mesodermal, and endodermal origin when exposed to general and specific induction agents.

The second category of precursor cells consists of three separate stem cells. Each of the cells forms cells of a specific embryonic germ layer lineage (ectodermal stem cells, mesodermal stem cells and endodermal stem cells). When exposed to general and specific inductive agents, germ layer lineage ectodermal stem cells can differentiate into, for example, neuronal progenitor cells, neurons, ganglia, oligodendrocytes, astrocytes, synaptic vesicles, radial glial cells, and keratinocytes.

The third category of precursor cells present in adult tissues is composed of a multitude of multipotent, tripotent, bipotent, and unipotent progenitor cells. In solid tissues these cells are located near their respective differentiated cell types. Progenitor cells do not typically display phenotypic expression markers for pluripotent ELSCs, such as stage specific embryonic antigen-4, stage-specific embryonic antigen-1 or stage-specific embryonic antigen-3, or carcinoembryonic antigen cell adhesion molecule-1. Similarly, progenitor cells do not typically display phenotypic expression markers for germ layer lineage stem cells, such as nestin for cells of the ectodermal lineage or fetoprotein for cells of the endodermal lineage.

A progenitor cell may be multipotent, having the ability to form multiple cell types. A precursor cell of ectodermal origin residing in the adenohypophysis and designated the adenohypophyseal progenitor cell is an example of a multipotent progenitor cell. This cell will form gonadotrophs, somatotrophs, thyrotrophs, corticotrophs, and mammotrophs. Progenitor cells for particular cell lineages have unique profiles of cell surface cluster of differentiation (CD) markers and unique profiles of phenotypic differentiation expression markers. Progenitor cells do not typically spontaneously differentiate in serum-free defined medium in the absence of a differentiation agent, such as LIF or ADF. Thus, unlike embryonic stem cells which spontaneously differentiate under these conditions, progenitor cells remain quiescent unless acted upon by proliferative agents (such as platelet-derived growth factor) and/or progressive agents (such as insulin, insulin-like growth factor-I or insulin-like growth factor-II).

Progenitor cells can regulate their behavior according to changing demands such that after transplantation they activate from quiescence to proliferate and generate both new satellite cells and substantial amounts of new differentiated cells. For example, the contractile units of muscle are myofibers, elongated syncytial cells each containing many hundreds of postmitotic myonuclei. Satellite cells are resident beneath the basal lamina of myofibers and function as myogenic precursors during muscle regeneration. In response to muscle injury, satellite cells are activated, proliferate, and differentiate, during which they fuse together to repair or replace damaged myofibers. When satellite cells are removed from their myofibers by a non-enzymatic physical trituration method, they retain an ability to generate substantial quantities of new muscle after grafting that is not retained after isolation by enzymatic digestion; conventional enzymatic disaggregation techniques impair myogenic potential (Collins and Partridge, “Self-Renewal of the Adult Skeletal Muscle Satellite Cell,” Cell Cycle 4:10, 1338-1341 (2005)).

Accordingly, the present invention also contemplates the use of non-embryonic stem cells, such as those described above. In some embodiments, mesenchymal stem cells (MSCs) can be derived from marrow, periosteum, dermis and other tissues of mesodermal origin (See, e.g., U.S. Pat. Nos. 5,591,625 and 5,486,359, each of which is incorporated herein by reference). MSCs are the formative pluripotential blast cells that differentiate into the specific types of connective tissues (i.e. the tissues of the body that support the specialized elements; particularly adipose, areolar, osseous, cartilaginous, elastic, marrow stroma, muscle, and fibrous connective tissues) depending upon various in vivo or in vitro environmental influences. Although these cells are normally present at very low frequencies in bone marrow, various methods have been described for isolating, purifying, and greatly replicating the marrow-derived mesenchymal stem cells in culture, i.e. in vitro (See also U.S. Pat. Nos. 5,197,985 and 5,226,914 and PCT Publication No. WO 92/22584, each of which is incorporated herein by reference).

Various methods have also been described for the isolation of hematopoietic stem cells (See, e.g., U.S. Pat. Nos. 5,061,620; 5,750,397; 5,716,827 all of which are incorporated herein by reference). It is contemplated that the methods of the present invention can be used to produce lymphoid, myeloid and erythroid cells from hematopoietic stem cells. The lymphoid lineage, comprising B-cells and T-cells, provides for the production of antibodies, regulation of the cellular immune system, detection of foreign agents in the blood, detection of cells foreign to the host, and the like. The myeloid lineage, which includes monocytes, granulocytes, megakaryocytes as well as other cells, monitors for the presence of foreign bodies in the blood stream, provides protection against neoplastic cells, scavenges foreign materials in the blood stream, produces platelets, and the like. The erythroid lineage provides the red blood cells, which act as oxygen carriers.

Accordingly, the present invention also contemplates the use of neural stem cells, which are generally isolated from developing fetuses. The isolation, culture, and use of neural stem cells are described in U.S. Pat. Nos. 5,654,183; 5,672,499; 5,750,376; 5,849,553; and 5,968,829, all of which are incorporated herein by reference. It is contemplated that the methods of the present invention can use neural stem cells to produce neurons, glia, melanocytes, cartilage and connective tissue of the head and neck, stroma of various secretory glands and cells in the outflow tract of the heart.

In some embodiments, the quorum sensing systems are incorporated into cord blood cells. Transplantation of umbilical-cord blood has been successfully performed to treat individuals with blood diseases; the donors used have been newborn siblings who are perfect HLA matches for the affected sibling. The advantages of cord blood as a source of hematopoietic stem cells for transplantation are clear. First, the proliferative capacity of hematopoietic stem cells in cord blood is superior to that of cells in marrow or blood from adults. Because they proliferate rapidly, the stem cells in a single unit of cord blood can reconstitute the entire hematopoietic system. Second, the use of cord blood reduces the risk of graft-versus-host disease, the main obstacle to the success of allogeneic transplantation of hematopoietic stem cells. Graft-versus-host disease is caused by a reaction of T cells in the graft to HLA antigens in the recipient; the immaturity of lymphocytes in cord blood dampens that reaction. A joint European study showed that recipients of cord blood from HLA-identical siblings had a lower risk of acute or chronic graft-versus-host disease than recipients of marrow from HLA-identical siblings. Children with acute leukemia who received HLA-mismatched cord blood from an unrelated donor also had a lower risk of graft-versus-host disease than recipients of HLA-mismatched marrow from an unrelated donor (Hematopoietic stem-cell transplants using umbilical-cord blood, New England Journal of Medicine. 2001, 344(24):1860-1861, editorial). Cord blood cells from siblings or children with matching HLA could be used to produce cell lines for use as contemplated by this invention.

H. Treatment Methods

In preferred embodiments, cells incorporating the vectors or systems of vectors described are introduced into a subject in need of treatment. In some embodiments, where the subject is diabetic, synthetic β cells comprising the quorum sensing system described above are introduced into the diabetic subject. In preferred embodiments, the synthetic β cells produce insulin. It will be recognized that the systems described above can be adapted to cause the differentiation of embryonic stem cells, adult stem cells and cord blood stem cells into a variety of cell types that can be utilized for therapeutic purposes, including neurons, chondrocytes, myocytes, and keratinocytes of various types.

In some embodiments, banks of cells comprising the quorum sensing systems described above are provided. In preferred embodiments, the banks of cells include cell lines that are programmed to attain a particular differentiated state, such as β cells. In some embodiments, the banks of cells comprise multiple cell lines expressing different combinations of HLA antigens. It is contemplated that such banks will result in an increased likelihood of obtaining a 6-6, 10-10 or greater match for a particular subject.
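By way of non-limiting illustration, selecting the best-matched cell line from such a bank can be sketched as follows. The sketch assumes that a 6-6 match is scored over two alleles at each of HLA-A, HLA-B and HLA-DRB1, and a 10-10 match additionally over HLA-C and HLA-DQB1. The function names, cell line names and allele assignments are hypothetical and are not taken from this specification.

```python
from collections import Counter

# Assumption: loci used for 6-6 and 10-10 match scoring.
LOCI_6 = ("A", "B", "DRB1")
LOCI_10 = LOCI_6 + ("C", "DQB1")


def count_matches(subject, cell_line, loci):
    """Count alleles shared between subject and cell line over the given loci."""
    matched = 0
    for locus in loci:
        # Multiset intersection so that a homozygous allele is not counted twice
        # against a single copy on the other side.
        shared = Counter(subject[locus]) & Counter(cell_line[locus])
        matched += sum(shared.values())
    return matched


def best_match(subject, bank, loci):
    """Return the name of the banked cell line with the most matched alleles."""
    return max(bank, key=lambda name: count_matches(subject, bank[name], loci))


# Hypothetical typing data, for illustration only.
subject = {
    "A": ("A*01:01", "A*02:01"),
    "B": ("B*07:02", "B*08:01"),
    "DRB1": ("DRB1*03:01", "DRB1*15:01"),
}
bank = {
    "line_1": {
        "A": ("A*01:01", "A*03:01"),
        "B": ("B*07:02", "B*44:02"),
        "DRB1": ("DRB1*03:01", "DRB1*15:01"),
    },
    "line_2": {
        "A": ("A*01:01", "A*02:01"),
        "B": ("B*07:02", "B*08:01"),
        "DRB1": ("DRB1*03:01", "DRB1*15:01"),
    },
}

for name in bank:
    print(name, count_matches(subject, bank[name], LOCI_6), "of", 2 * len(LOCI_6))
print("best:", best_match(subject, bank, LOCI_6))
```

A larger bank with more HLA combinations raises the chance that at least one line reaches the desired score for a given subject. Clinical matching conventions, for example the handling of homozygous loci, may differ from this simple count.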

I. Identification of Additional Genes Involved in Beta Cell Differentiation

In some embodiments, the systems of the present invention are used to identify genes involved in pancreatic β cell differentiation using RNAi knockdown assays. In preferred embodiments, the assays utilize a large library of shRNAs comprising >100,000 hairpins which target ~21,000 human and ~17,000 mouse genes, 30,000 of which are cloned into a tet-inducible microRNA-embedded shRNA lentiviral vector.

The RNAi knockdown assay is illustrated in FIG. 4A-4B. First, engineered ES cells are transduced with a lentiviral shRNA library at an MOI of 1, so that each post-transduction cell will have one shRNA. After quorum sensing and two-step differentiation occur in the transduced cells, three unique pools of cells emerge in the population. The first consists of cells in which shRNA interferes with the differentiation of mES cells to endoderm, causing cells to retain stem cell character and proliferate rapidly. The second pool consists of cells in which shRNA interferes with the differentiation of endodermal cells to β cells, causing cells to retain endoderm character and proliferate more slowly than the stem cells. The last pool consists of cells in which shRNA does not interfere with the two-step differentiation process from mES cells, resulting in slowly- or non-dividing β cells. Two populations of cells, a sample of the initial population after transduction and a sample of the population directed to differentiate, will be subjected to microarray analysis. Comparisons of the relative ratios of individual shRNAs will allow identification of genes that are involved at each step of the differentiation process.
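By way of non-limiting illustration, the comparison of relative shRNA ratios between the two populations can be sketched as a log2 ratio of each shRNA's share of the total signal. This is a minimal sketch and not the analysis pipeline of the invention; the function name, the pseudocount and the signal values are hypothetical.

```python
import math


def log2_enrichment(initial_counts, final_counts, pseudocount=1.0):
    """Return log2(final fraction / initial fraction) for each shRNA.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for shRNAs that are absent from one of the two populations.
    """
    ids = sorted(set(initial_counts) | set(final_counts))
    init_total = sum(initial_counts.get(i, 0) + pseudocount for i in ids)
    final_total = sum(final_counts.get(i, 0) + pseudocount for i in ids)
    scores = {}
    for i in ids:
        init_frac = (initial_counts.get(i, 0) + pseudocount) / init_total
        final_frac = (final_counts.get(i, 0) + pseudocount) / final_total
        scores[i] = math.log2(final_frac / init_frac)
    return scores


# Hypothetical microarray signal values, for illustration only.
initial = {"shRNA_A": 100, "shRNA_B": 100, "shRNA_C": 100}
final = {"shRNA_A": 400, "shRNA_B": 150, "shRNA_C": 50}

scores = log2_enrichment(initial, final)
for shrna_id, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{shrna_id}: {score:+.2f}")
```

Under the proliferation differences described above, an shRNA that blocks the first differentiation step would be expected to show the strongest enrichment, one that blocks the second step an intermediate value, and one that does not interfere a depletion relative to the others.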

EXAMPLES

Example 1

293FT cells that were genetically engineered to synthesize 3OC6HSL were grown in liquid media, and this media was found to contain 3OC6HSL (FIG. 2a). The 3OC6HSL detection pathway consists of a signal transducer that binds 3OC6HSL and activates transcription of genes controlled by a synthetic lux promoter. The 3OC6HSL signal transducer is a chimeric LuxR-activator protein created by the fusion of a p65 mammalian activation domain to a codon-optimized mammalian version of a bacterial LuxR. Initial testing of the mammalian version of LuxR in 293FT cells shows a strong response to the addition of 3OC6HSL (FIG. 2b).

Example 2

Synthetic Gene Networks in Mammalian Cells—the rtTA Switch.

A reverse Tetracycline-controlled transactivator (rtTA) switch, where gene expression is upregulated by the addition of Doxycycline (Dox), has been implemented. rtTA is constitutively expressed in the Ainv15 mES cell line and, in the presence of Dox, activates transcription of a given cell fate regulator (CFR), and of EGFP as a control, from the TRE promoter. A dose-response curve (not shown) demonstrates that the expression level (as measured by fluorescence) depends on the Dox concentration.

Controlled Induction of Cell Fate Regulators in mES.

Ngn1, MyoD and Nanog: Ainv15 mES cells constitutively expressing the Dox-inducible circuit can be infected with a virus encoding Ngn1/EGFP or MyoD/EGFP under the control of a TRE promoter. The cells maintain self-renewal in the absence of Dox, while the presence of Dox results in differentiation into cells with either a neuronal morphology (Ngn1) or a muscle cell morphology (MyoD).

Matrigel-Embedded mES—MyoD Expression.

To verify the conditions required for mES cell growth and differentiation in semi-solid media, embedded mES cells were infected with the Dox-inducible circuit encoding MyoD/EGFP in Matrigel and subsequently induced with 1 mg/ml Dox over several days. Cells not induced with Dox grow, do not express EGFP and maintain stem cell morphology in a Matrigel matrix. The addition of Dox results in EGFP expression and formation of multinucleated syncytia 60 hours post induction, which is a key developmental step in the formation of muscle fibers.

Example 3

293FT and CHO cells were infected with virus encoding pLV-pTat-IRES2-EGFP, and then exposed to three different types of TAT communication molecules with various secretion tags. The 293FT cells are able to internalize the TAT protein, while CHO cells cannot. In the experiment, sender cells were grown for 2 days, and then receiver 293FT cells were grown in the supernatant. As expected, the supernatant from 293FT cells, which can internalize the TAT molecule, did not yield much communication, while supernatant from the CHO cells was able to activate GFP expression significantly in the receiver cells.

Example 4

This example describes the assessment of the transactivatory properties of TAT and the functionality of the signaling modules by having sender and receiver modules in the same cell. Sender cells expressing TAT containing a secretion signal and receiver cells containing the detection module, separated by a permeable barrier (e.g. transwell inserts), are used to validate and optimize the cell-cell transducing capabilities of TAT. A hemagglutinin tag (HA-tag) is added as a translational fusion to the C-terminus of TAT, to confirm its expression by means of immunofluorescence (IF) and western blotting (WB).

Engineering TAT Internal Signaling:

By having receiver and mock TAT sender (no secretion signal for export) in the same cell, the transactivatory properties of TAT as well as the detector functionality are validated. As described above, the rtTA switch, where gene expression is upregulated by the addition of Dox, is operational. Full-length HIV-1 TAT with a C-terminal HA-tag is placed under the control of the TRE promoter, enabling induction of TAT expression by adding Dox to the cell culture media. The receiver contains the wild-type HIV-1 pTAT promoter with the TAR element driving expression of EGFP upon TAT binding. To test the TAT transactivation capabilities, Dox is added to cells, whereupon TAT expression is induced. TAT binds to pTAT and induces the expression of EGFP, which is detected by FACS and/or fluorescence microscopy (FIG. 26A-26B).

Engineering TAT Senders:

By co-infecting the rtTA switch and the TAT gene with an N-terminal secretion tag and a C-terminal HA-tag under control of the pTRE promoter into cells, Dox-inducible secretion of TAT is achieved (FIG. 26b). To get a high level of secretion to the extracellular milieu, the N-terminal export signal sequence present in IL-2 is added to the N-terminus of TAT. To verify protein expression and export, cell lysates and cell supernatants are checked by WB against the HA-tag. Low levels of TAT in the supernatant can be concentrated by precipitation of the protein fraction with trichloroacetic acid.

Engineering TAT Receivers:

By infecting the module containing the pTAT promoter controlling the expression of EGFP, TAT-inducible expression of EGFP is achieved (FIG. 26b). Using the TAT senders secreting the protein into the supernatant described in the previous paragraph, EGFP expression in receivers is detectable by fluorescence microscopy or FACS. The uptake of TAT by receiver cells will also be confirmed by IF staining against the HA-tag. Using transwell inserts as a permeable barrier to keep sender and receiver cells separate, cell-cell communication of the secreted TAT can be assessed by measuring the EGFP protein expression levels in the receiver cells (FACS/fluorescence microscopy).

The presence of TAT in the sender cell, in the supernatant, and in the cytoplasm or nucleus of the receiver cell will be confirmed by IF and WB against the HA-tag in TAT. Recent reports using a HIS-tag purified TAT-peptide/Cre fusion protein have suggested that release from macropinosomes is the limiting factor in the transduction of cells. The addition of the N-terminal 20 amino acids of the influenza virus hemagglutinin protein HA2 (a fusogenic peptide that destabilizes lipid membranes at low pH in mature endosomes) to the C-terminus of the TAT-peptide markedly increased its release from the macropinosomes, and subsequently the effectiveness of transduction. As a consequence, the addition of this fusogenic peptide might also increase the effectiveness of cell-cell transduction.

If TAT receiver sensitivity is too low, a TAT-Cre fusion protein with a secretion signal in the sender cells will be made and used to transduce a reporter module into receiver cells where the expression of EGFP from a constitutive promoter is blocked by a terminator with two loxP sites. In principle, the translocation of one TAT-Cre molecule into the nucleus of a receiver cell should be sufficient to trigger the Cre mediated recombination/removal of the terminator, hence full expression of EGFP. Such a TAT-Cre fusion expressing module could also be very useful in tracing the dispersion of TAT in vivo, by using a reporter mouse cell line expressing EGFP upon recombination of a terminator located in its promoter.

Example 5

This example describes the assessment of the effectiveness of in silico designed ZFP-DTS pairs and the functionality of the signaling modules by having sender and receiver modules in the same cell. In a second step, sender cells expressing TAT-ZFP fusions and receiver cells containing the detection module separated by a barrier (e.g. transwell inserts) are used to check and optimize the cell-cell transducing capabilities of TAT-ZFP fusion proteins with their cognate DTS.

Published in silico designed ZFP-DTS pairs, for example Jazz, EPOZFP-862c and ZFP-809, binding to the DTSs “GCTGCTGCG”, “GCGGTGGCT” and “G(G/c)GGG(T/a)G(A/g)C” (5-fold, 15-fold and 46-130 fold induction of reporter genes, respectively) (37-39), will be used first to demonstrate the functionality of the reporter system. Subsequently, novel in silico ZFP-DTS pairs are designed.

Engineering ZFP-DTS Pairs (in Silico):

Described ZF-nucleotide triplet interactions will be used to assemble novel ZFP-DTS pairs in silico, using existing knowledge on how to engineer translational fusions of such single zinc fingers into a ZFP; this knowledge is also integrated into the online ZFP design toolset “ZF Tools” developed by the Barbas group (Mandell, J. G. & Barbas, C. F., 3rd (2006) Nucleic acids research 34, W516-523; Mandell, J. G. (2006) (on the world wide web at scripps.edu/mb/barbas/zfdesign/zfdesignhome.php), p. Zinc Finger Tools Version 3.0). Care will be taken to design ZFP-DTS pairs for which the DTS has no significant homologies in the human or mouse genome (nucleotide blast against the genome sequence), to reduce the probability of ZFP binding to endogenous sequences.
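By way of non-limiting illustration, the idea behind screening a candidate DTS against genomic sequence can be sketched as a scan of both strands for exact or near-exact occurrences. This toy scan is a stand-in for, and not a replacement of, the nucleotide BLAST search referred to above; the function names, the mismatch threshold and the test sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(genome, dts, max_mismatches=1):
    """Return (start, strand, mismatches) for each window resembling the DTS.

    Both the DTS and its reverse complement are checked so that sites on
    either strand are reported.
    """
    genome = genome.upper()
    dts = dts.upper()
    targets = {"+": dts}
    rc = reverse_complement(dts)
    if rc != dts:  # avoid double-reporting a palindromic site
        targets["-"] = rc
    hits = []
    for start in range(len(genome) - len(dts) + 1):
        window = genome[start:start + len(dts)]
        for strand, target in targets.items():
            mismatches = hamming(window, target)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits


# Hypothetical toy sequence containing the Jazz DTS on each strand.
toy_genome = "AAAGCTGCTGCGTTTCGCAGCAGCAAA"
for hit in find_near_matches(toy_genome, "GCTGCTGCG", max_mismatches=0):
    print(hit)
```

A candidate DTS with few or no near matches in the genome of the host cell would be preferred, since each match is a potential endogenous binding site for the ZFP.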

Engineering the Internal ZFP-DTS Signaling:

By having receiver and mock sender (no secretion signal for export) in the same cell, the transactivatory properties of the ZFP-DTS pairs as well as the detector functionality will be validated. As described above, the reverse Tetracycline-controlled transactivator (rtTA) switch, where gene expression is upregulated by the addition of Dox, is operational. For the mock secretion module, a TAT-ZFP-VP64 fusion (VP64 as a strong enhancer) will be placed under the control of the TRE promoter, enabling induction of TAT-ZFP-VP64 expression by adding Dox to the cell culture media. The receiver module contains a minimal CMV promoter with the cognate DTS for the ZFP driving the expression of EGFP upon TAT-ZFP-VP64 binding. To test the transactivation capabilities, Dox is added to cells containing the artificial network, whereupon TAT-ZFP-VP64 is expressed, is exported to the extracellular milieu by virtue of its N-terminal IL-2 signal peptide, re-enters the cell with the help of the transducing TAT domain, and localizes to the nucleus, where it binds to its cognate DTS. The binding of the fusion protein to the minimal CMV promoter will induce expression of EGFP (detectable by FACS and/or fluorescence microscopy) (FIG. 27a).

Engineering ZFP Senders:

By co-infecting the rtTA switch and the TAT-ZFP-VP64 fusion gene with a secretion signal under control of the TRE promoter into cells, Dox-inducible secretion of TAT-ZFP-VP64 is achieved (FIG. 27b). This fusion protein is exported by virtue of its N-terminal secretion signal to the extracellular milieu, where it can be detected by using receiver cells, or, at an earlier stage, by WB. The expression of the fusion protein in the sender cells can be verified by WB of cell lysates or IF staining against the HA-tag.

Engineering ZFP Receivers:

By infecting the module containing the minimal CMV promoter with the DTS for the cognate ZFP controlling the expression of EGFP, TAT-ZFP-VP64-inducible expression of EGFP is achieved (FIG. 27b). Specifically, the secretion signal-containing TAT-ZFP-VP64 fusion protein that is exported into the cell supernatant by sender cells, which are separated from the receiver cells by a transwell membrane, will transduce into receiver cells, localize to the nucleus and bind its cognate DTS in the minimal CMV promoter, thereby inducing expression of EGFP (detectable by FACS and/or fluorescence microscopy). The uptake of the fusion protein will also be tracked by IF staining against the HA-tag.

If the engineering of cognate ZFP-DTS pairs and/or the export of ZFPs is problematic, the ZFP will be replaced by Gal4, and the well-described Gal4 DTS will be used in the receiver cells. This library could then be expanded by replacing Gal4 with known DNA-binding proteins/transcription factors (e.g. LacI, cI, TetR, rtTA) and their cognate DTS on the receiver side. Also, different endocytic domains and different secretion tags (from IL-I1, GM-CSF . . . ) can be used.

Full-length TAT might bind to endogenous regulatory elements, where it would have undesirable effects on the expression of endogenous genes. By using only a minimal TAT-peptide of 10 amino acids needed for cellular uptake and nuclear localization, it should be possible to reduce these undesirable interactions. Alternatively, other transducing/NLS-containing peptide sequences could be used in the delivery, for example the 34 amino acid sequence of the Beta2/NeuroD transcription factor or the third α-helix of the Antennapedia homeoprotein.

Example 6

This example demonstrates that the artificial cell-cell communication signals described in Example 5 can be used to induce differentiation of mouse embryonic stem cells in vitro. Two types of sender cells will express TAT-ZFP-VP64 or TAT-ZFP2-VP64. ZFP1 activates MyoD expression, while ZFP2 activates Ngn1 expression. By placing patches of sender cells on a Matrigel matrix with embedded receiver cells, the gradient of the two synthetic morphogens should induce differentiation of the receiver cells into a myoblast- or neuron-like morphology, depending on their location.

Engineering Senders:

By infecting the rtTA switch, the TAT-ZFP-VP64 fusion expressing module from above and additionally a module with a distinct ZFP (TAT-ZFP2-VP64), two sender cell types are created expressing TAT-ZFP-VP64 or TAT-ZFP2-VP64 upon addition of Dox.

Engineering Receivers:

By co-infecting the two modules containing a DTS1 or DTS2, a minimal CMV promoter, the myoD or ngn1 genes, an internal ribosome entry site (IRES) and an egfp or dsred2 gene into receiver cells, they will express MyoD and EGFP in response to ZFP1, and Ngn1 and DsRed2 in response to ZFP2. These signaling circuits will be first tested with mES in liquid cell culture. If the communication and differentiation in the liquid media with transwell insert works, the receiver mES cells will be embedded in a Matrigel matrix and the sender cells filled into holes of various diameters at the surface of this matrix. This is intended to induce three-dimensional differentiation patterns in the receiver cells. The differentiation success in both the liquid and the matrix approaches can be assessed by morphological changes, IF stainings, or RT-PCR for relevant markers (e.g. Desmin, Mef2, Cx2 for myocyte assays; Lim 1/2, Map2 and NSE for neural assays).

Substrates other than Matrigel, for example a collagen-agarose matrix, might provide better growth and differentiation capabilities for mES. The diffusion properties of the TAT-ZFP fusion proteins will be critical, as well as the positioning and the number of the sender cells on the matrix. Suboptimal expression of MyoD/Ngn1 could be overcome by incorporating signal amplifying circuits into the receiver cells. Crosstalk between the ZFP-DTS pairs should have been ruled out earlier in the ZFP-DTS optimization procedure; nevertheless, the ZFP and/or DTS coding sequences could be exchanged for other variants.

1-77. (canceled)
 78. A composition comprising one or more mammalian vectors that comprise: a) a first nucleic acid sequence capable of producing a first cell fate regulator protein that is capable of inducing differentiation of a first cell type into a second cell type that expresses a protein marker, and b) a second nucleic acid sequence capable of producing a second cell fate regulator protein that is operably linked to a cell type specific promoter of said second cell type, and that is capable of inducing differentiation of said second cell type into a third cell type.
 79. The composition of claim 78, wherein said one or more vectors further comprises c) a third nucleic acid sequence capable of producing a third cell fate regulator protein that is operably linked to a cell type specific promoter of said second cell type, and that is capable of inducing differentiation of said second cell type into said third cell type.
 80. The composition of claim 78, wherein said composition is comprised in a mammalian cell, wherein said vectors are exogenous to said mammalian cell.
 81. The composition of claim 78, wherein said first cell fate regulator protein is selected from the group consisting of Sox17, Gata4, Gata6, Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1.
 82. The composition of claim 79, wherein one or more of said second cell fate regulator protein and said third cell fate regulator protein is selected from the group consisting of Pdx1, Ngn3, Nkx6.1, Nkx2.2, Fgf4, BRA, Wnt9, NCAD, CER, FoxA2, CxcR4, Hnf1B, Hnf4A, Hnf6, HlxB9, Pax4, Cgc, GHRL, SST, PPY, Activin, Fgf10, Cyc, RA, Ex4, DAPT, HGF and Igf1.
 83. The composition of claim 78, wherein one or more of said first cell type and said second cell type is an epiblast-like stem cell (ELSC) and said protein marker is selected from the group consisting of stage-specific embryonic antigen-4, stage-specific embryonic antigen-1, stage-specific embryonic antigen-3, and carcinoembryonic antigen cell adhesion molecule-1.
 84. The composition of claim 83, wherein said second cell type is a myoblast cell, and said protein marker is dystrophin.
 85. The composition of claim 83, wherein said second cell type is an adipocyte cell, and said protein marker is PPARγ.
 86. The composition of claim 83, wherein said second cell type is an endoderm cell, and said protein marker is selected from the group consisting of Hnf3β, lamininB1, and α-fetoprotein (AFP).
 87. The composition of claim 83, wherein said second cell type is an ectoderm cell and said protein marker is nestin.
 88. The composition of claim 78, wherein said first cell fate regulator protein comprises Gata4.
 89. The composition of claim 78, wherein said second cell fate regulator protein comprises Pdx1.
 90. The composition of claim 78, wherein said second cell fate regulator protein comprises Ngn3 and Pdx1.
 91-125. (canceled)